RESULT 1
AAE28936 

ID AAE28936 standard; protein; 377 AA. 
XX 

AC AAE28936; 
XX 

DT 27-JAN-2003 (first entry) 
XX 

DE Human sodium/bile-like transporter protein #1. 
XX 

KW Human; sodium/bile-like transporter; novel human protein; drug screening; 

KW NHP; cancer; cosmetic; nutriceutical ; gene therapy; cytostatic; 

KW chromosome 4. 
XX 

OS Homo sapiens. 
XX 



PN WO200272774-A2. 
XX 

PD 19-SEP-2002. 
XX 

PF 06-MAR-2002; 2002WO-US007438 . 
XX 

PR 12-MAR-2001; 2001US-0275009P . 

PR 17-APR-2001; 2001US-0284152P.
XX 

PA (LEXI-) LEXICON GENETICS INC. 
XX 

PI Wilganowski NL, Nepomnichy B, Burnett MB, Hu Y;
XX 

DR WPI; 2002-723334/78. 

DR N-PSDB; AAD46333. 
XX 

PT New protein and nucleic acid molecule, useful for diagnosing or treating 

PT diseases, e.g. cancer, for drug screening, clinical trial monitoring, 

PT pharmacogenomics, and for cosmetic or nutriceutical applications. 
XX 

PS Claim 4; Page 37-38; 41pp; English. 
XX 

CC The invention relates to novel human proteins (NHP), sodium/bile-like

CC transporter and their nucleic acids. The invention is useful for 

CC identifying the protein which may be used for diagnosis, clinical trial 

CC monitoring, drug screening, pharmacogenomics, treatment of diseases such 

CC as cancer, and for cosmetic or nutriceutical applications. The nucleic 

CC acid molecule may also be used as hybridisation probes for screening 

CC libraries, assessing gene expression patterns, and in amplification 

CC assays. It is also used in gene therapy. The present sequence is human 

CC sodium/bile-like transporter protein. The gene encoding this protein is 

CC located at chromosome 4 

XX 

SQ Sequence 377 AA; 

Query Match 100.0%; Score 1979; DB 5; Length 377; 
Best Local Similarity 100.0%; Pred. No. 3e-205; 

Matches 377; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 

Qy 1 MRANCSSSSACPANSSEEELPVGLEVHGNLELVFTWSTVMMGLLMFSLGCSVEIRKLWS 60
     |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1 MRANCSSSSACPANSSEEELPVGLEVHGNLELVFTWSTVMMGLLMFSLGCSVEIRKLWS 60

Qy 61 HIRRPWGIAVGLLCQFGLMPFTAYLLAISFSLKPVQAIAVLIMGCCPGGTISNIFTFWVD 120
      ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 61 HIRRPWGIAVGLLCQFGLMPFTAYLLAISFSLKPVQAIAVLIMGCCPGGTISNIFTFWVD 120

Qy 121 GDMDLSISMTTCSTVAALGMMPLCIYLYTWSWSLQQNLTIPYQNIGITLVCLTIPVAFGV 180
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 121 GDMDLSISMTTCSTVAALGMMPLCIYLYTWSWSLQQNLTIPYQNIGITLVCLTIPVAFGV 180

Qy 181 YVNYRWPKQSKIILKIGAVVGGVLLLVVAVAGVVLAKGSWNSDITLLTISFIFPLIGHVT 240
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 181 YVNYRWPKQSKIILKIGAVVGGVLLLVVAVAGVVLAKGSWNSDITLLTISFIFPLIGHVT 240

Qy 241 GFLLALFTHQSWQRCRTISLETGAQNIQMCITMLQLSFTAEHLVQMLSFPLAYGLFQLID 300
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 241 GFLLALFTHQSWQRCRTISLETGAQNIQMCITMLQLSFTAEHLVQMLSFPLAYGLFQLID 300

Qy 301 GFLIVAAYQTYKRRLKNKHGKKNSGCTEVCHTRKSTSSRETNAFLEVNEEGAITPGPPGP 360
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 301 GFLIVAAYQTYKRRLKNKHGKKNSGCTEVCHTRKSTSSRETNAFLEVNEEGAITPGPPGP 360

Qy 361 MDCHRALEPVGHITSCE 377
       |||||||||||||||||
Db 361 MDCHRALEPVGHITSCE 377



RESULT 2 
AAE29906 

ID AAE29906 standard; protein; 377 AA. 
XX 

AC AAE29906; 
XX 

DT 24-FEB-2003 (first entry) 
XX 

DE Human transporter and ion channel (TRICH) protein #6. 
XX 

KW Human; transporter and ion channel; TRICH; neurodegenerative disorder; 

KW Parkinson's disease; Alzheimer's disease; muscular disorder; transgenic; 

KW myotonic dystrophy; catatonia; endocrine disorder; diabetes; cytostatic; 

KW Grave's disease; cancer; leukaemia; cervical; immunological; scleroderma; 

KW systemic lupus erythematosus; allergy; gastrointestinal; Crohn's disease; 

KW Goodpasture's syndrome; infection; cardiovascular; fungicide; nootropic; 

KW hepatic disease; cirrhosis; gene therapy; uropathic; anti-HIV; virucide; 

KW atherosclerosis; antiparasitic; protozoacide; antibacterial. 
XX 



OS Homo sapiens.
XX






FH   Key             Location/Qualifiers
FT   Domain          28..56
FT                   /note= "Transmembrane domain"
FT   Domain          39..220
FT                   /note= "Sodium bile acid symporter domain"
FT   Peptide         41..97
FT                   /label= Signal peptide
FT   Domain          69..89
FT                   /note= "Transmembrane domain"
FT   Domain          95..115
FT                   /note= "Transmembrane domain"
FT   Protein         98..377
FT                   /note= "Human mature TRICH protein"
FT   Domain          131..153
FT                   /note= "Transmembrane domain"
FT   Domain          159..182
FT                   /note= "Transmembrane domain"
FT   Domain          191..218
FT                   /note= "Transmembrane domain"
FT   Domain          220..248
FT                   /note= "Transmembrane domain"


XX 






PN WO200277237-A2.
XX

PD 03-OCT-2002.
XX

PF 08-FEB-2002; 2002WO-US003657 . 
XX 

PR 09-FEB-2001; 2001US-0267892P . 

PR 23-FEB-2001; 2001US-0271168P . 

PR 02-MAR-2001; 2001US-0272890P.

PR 16-MAR-2001; 2001US-0276860P . 

PR 23-MAR-2001; 2001US-0278255P . 

PR 30-MAR-2001; 2001US-0280538P . 

PR 25-JAN-2002; 2002US-0351359P . 
XX 

PA (INCY-) INCYTE GENOMICS INC. 
XX 

PI Lee EA, Ding L, Baughn MR, Tribouley CM, Bruns CM, Elliott VS; 

PI Walia NK, Forsythe IJ, Raumann BE, Burford N, Lai PG, Thornton M; 

PI Gandhi AR, Arvizu C, Yao MG, Yue H, Xu Y, Hafalia AJA, Ison CH; 

PI Chen H; 
XX 

DR WPI; 2003-018931/01. 

DR N-PSDB; AAD47353. 
XX 

PT New TRICH polypeptides, useful for diagnosing, preventing, and treating 

PT disorders associated with an abnormal expression or activity of TRICH, 

PT e.g. neuromuscular, immunological, cardiovascular disorders, cancer and 

PT infection. 
XX 

PS Claim 1; Page 158-159; 214pp; English. 
XX 

CC The invention relates to human transporters and ion channels (TRICH) and 

CC their nucleic acids. The sequences of the invention are useful in 

CC diagnosing, preventing, and treating disorders associated with an 

CC abnormal expression or activity of TRICH, such as neurodegenerative 

CC disorders (e.g. Parkinson's disease, Alzheimer's disease), muscular 

CC disorders (e.g. myotonic dystrophy, catatonia), endocrine disorders (e.g. 

CC diabetes, Grave's disease), cancers (e.g. leukaemia, cervical or breast 

CC cancers), immunological disorders (e.g. scleroderma, systemic lupus 

CC erythematosus, allergies), gastrointestinal disorders (e.g. Crohn's 

CC disease), renal disorders (e.g. Goodpasture's syndrome), infections (e.g. 

CC viral, bacterial, fungal, parasitic, protozoal, helminthic), 

CC cardiovascular disorders (e.g. atherosclerosis), or hepatic diseases 

CC (e.g. cirrhosis). TRICH or its fragments may also be used in screening

CC for compounds that specifically bind to and modulate its activity. TRICH 

CC DNA can be used to create humanised animals or transgenic animals to 

CC model human disease. It is also used in gene therapy. The present 

CC sequence is human TRICH protein 

XX 

SQ Sequence 377 AA; 

Query Match 99.5%; Score 1970; DB 6; Length 377; 

Best Local Similarity 99.5%; Pred. No. 2.8e-204; 

Matches 375; Conservative 0; Mismatches 2; Indels 0; Gaps 0; 

Qy 1 MRANCSSSSACPANSSEEELPVGLEVHGNLELVFTWSTVMMGLLMFSLGCSVEIRKLWS 60
     ||||||||||||||||||||||||| |||||||||| ||||||||||||||||||||||
Db 1 MRANCSSSSACPANSSEEELPVGLEAHGNLELVFTWPTVMMGLLMFSLGCSVEIRKLWS 60

Qy 61 HIRRPWGIAVGLLCQFGLMPFTAYLLAISFSLKPVQAIAVLIMGCCPGGTISNIFTFWVD 120
      ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 61 HIRRPWGIAVGLLCQFGLMPFTAYLLAISFSLKPVQAIAVLIMGCCPGGTISNIFTFWVD 120

Qy 121 GDMDLSISMTTCSTVAALGMMPLCIYLYTWSWSLQQNLTIPYQNIGITLVCLTIPVAFGV 180
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 121 GDMDLSISMTTCSTVAALGMMPLCIYLYTWSWSLQQNLTIPYQNIGITLVCLTIPVAFGV 180

Qy 181 YVNYRWPKQSKIILKIGAVVGGVLLLVVAVAGVVLAKGSWNSDITLLTISFIFPLIGHVT 240
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 181 YVNYRWPKQSKIILKIGAVVGGVLLLVVAVAGVVLAKGSWNSDITLLTISFIFPLIGHVT 240

Qy 241 GFLLALFTHQSWQRCRTISLETGAQNIQMCITMLQLSFTAEHLVQMLSFPLAYGLFQLID 300
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 241 GFLLALFTHQSWQRCRTISLETGAQNIQMCITMLQLSFTAEHLVQMLSFPLAYGLFQLID 300

Qy 301 GFLIVAAYQTYKRRLKNKHGKKNSGCTEVCHTRKSTSSRETNAFLEVNEEGAITPGPPGP 360
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 301 GFLIVAAYQTYKRRLKNKHGKKNSGCTEVCHTRKSTSSRETNAFLEVNEEGAITPGPPGP 360

Qy 361 MDCHRALEPVGHITSCE 377
       |||||||||||||||||
Db 361 MDCHRALEPVGHITSCE 377



RESULT 3 
ABG76899 

ID ABG76899 standard; protein; 325 AA. 
XX 

AC ABG76899; 
XX 

DT 05-NOV-2002 (first entry) 
XX 

DE Human ileal sodium/bile acid cotransporter-like protein. 
XX 

KW Human; NOVX; cardiomyopathy; atherosclerosis; cell signal processing; 

KW breast cancer; Alzheimer's disease; epilepsy; Huntington's disease; 

KW anxiety; behavioural disorder; multiple sclerosis; myasthenia gravis; 

KW neurodegeneration; Parkinson's disease; pain; stroke; endometriosis; 

KW autoimmune disease; allergy; addiction; asthma; transplantation; 

KW graft versus host disease; systemic lupus erythematosus; scleroderma; 

KW psoriasis; Crohn's disease; HIV infection; human immunodeficiency virus; 

KW atherosclerosis; cirrhosis; rheumatoid arthritis; diabetes; pancreatitis; 

KW thrombocytopenia; bleeding disorder; metabolic disorder; obesity; 

KW glucose transport defect; glomerulonephritis; hypercalcaemia; 

KW polycystic kidney disease; renal tubular acidosis; skin disorder; 

KW congenital diarrhoea; respiratory disease; gastro-intestinal disease; 

KW muscle disorder; bone disorder; joint disorder; skeletal disorder; 

KW haematopoietic disorder; urinary system disorder; osteoporosis; 

KW dental disease; dental infection; growth disorder; reproductive disorder; 

KW hypogonadism; fertility disorder; viral infection; bacterial infection; 

KW parasitic infection; metabolic pathway modulation; gene therapy; 

KW zinc metalloprotease; ADAM-TS 7; alpha-2-macroglobulin precursor; 

KW ileal sodium/bile acid cotransporter ; prohibitin; MT; CIP4; spinesin; 

KW macrophage stimulating protein precursor; fatty acid-binding protein; 

KW gap junction beta-5 protein; hepsin/plasma transmembrane serine protease. 

XX 

OS Homo sapiens. 



XX 

PN WO200233087-A2. 
XX 

PD 25-APR-2002. 
XX 

PF 17-OCT-2001; 2001WO-US032496 . 
XX 

PR 17-OCT-2000; 2000US-0241040P . 

PR 17-OCT-2000; 2000US-0241058P . 

PR 17-OCT-2000; 2000US-0241063P . 

PR 17-OCT-2000; 2000US-0241243P . 

PR 20-OCT-2000; 2000US-0242152P . 

PR 23-OCT-2000; 2000US-0242482P . 

PR 23-OCT-2000; 2000US-0242611P . 

PR 23-OCT-2000; 2000US-0242612P . 

PR 24-OCT-2000; 2000US-0242880P . 

PR 24-OCT-2000; 2000US-0242881P . 

PR 29-DEC-2000; 2000US-0259028P . 

PR 20-FEB-2001; 2001US-0269813P.

PR 25-APR-2001; 2001US-0286324P.

PR 29-MAY-2001; 2001US-0294108P.

PR 09-JUL-2001; 2001US-0303698P.

PR 16-OCT-2001; 2001US-00981151.
XX 

PA (CURA-) CURAGEN CORP. 
XX 

PI Edinger S, Gerlach V, Macdougall JR, Malyankar UM, Smithson G; 

PI Millet I, Peyman JA, Stone DJ, Gunther E, Ellerman K, Shimkets RA; 

PI Padigaru M, Guo X, Patturajan M, Taupier RJ, Burgess CE; 

PI Zerhusen BD, Kekuda R, Spytek KA, Gangolli EA, Fernandes ER; 

PI Gorman L; 

XX 

DR WPI; 2002-590434/63. 

DR N-PSDB; ABS59328. 
XX 

PT Cytoplasmic, nuclear, membrane bound and secreted polypeptides and 

PT nucleic acids encoding the polypeptides for diagnosing and treating e.g. 

PT cancer, Alzheimer's disease, cardiomyopathy, metabolic disease and 

PT diabetes. 

XX 

PS Claim 1; Page 50; 305pp; English. 
XX 

CC The present invention relates to new NOVX (NOV1-10) polypeptides. The

CC molecules of the invention are useful for treating or preventing a NOVX- 

CC associated disorder, such as cardiomyopathy, atherosclerosis, or a 

CC disorder related to cell signal processing and metabolic pathway 

CC modulation in humans. NOVX polypeptides, nucleic acids and antibodies are 

CC useful for treating or preventing disorders or syndromes including breast 

CC cancer, Alzheimer's disease, epilepsy, Huntington's disease, anxiety, 

CC behavioural disorders, multiple sclerosis, myasthenia gravis, 

CC neurodegeneration, Parkinson's disease, pain, stroke, autoimmune disease, 

CC allergies, addiction, asthma, endometriosis, graft versus host disease, 

CC systemic lupus erythematosus, scleroderma, transplantation, psoriasis, 

CC Crohn's disease, HIV (human immunodeficiency virus) infection, 

CC atherosclerosis, cirrhosis, rheumatoid arthritis, diabetes, 

CC thrombocytopenia, bleeding disorders, metabolic disorders, obesity, 

CC glucose transport defect, glomerulonephritis, hypercalcaemia, polycystic 



CC kidney disease, pancreatitis, renal tubular acidosis, skin disorders,

CC congenital diarrhoea, respiratory disease, gastro-intestinal diseases, 

CC muscle, bone, joint and skeletal disorders, haematopoietic disorders, 

CC urinary system disorders, osteoporosis, dental disease and infection, 

CC growth and reproductive disorders, hypogonadism, fertility, and/or other 

CC pathologies and disorders, viral, bacterial, or parasitic infections. The 

CC present amino acid sequence represents a NOVX protein of the invention 
XX 

SQ Sequence 325 AA; 

Query Match 59.6%; Score 1180; DB 5; Length 325; 

Best Local Similarity 80.0%; Pred. No. 8.5e-119; 

Matches 248; Conservative 11; Mismatches 31; Indels 20; Gaps 7; 



Qy 

Db 



1 MRANCSSSSACPANSSEEELPVGLEVHGNLELVFTWSTVMMGLLMFSLGCSVEIRKLWS 60 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I : I I I I I I I I I I I I I I I I I I I I 

1 MRANCSSSSACPANSSEEELPVGLEVHGNLELVFTWSTIMMGLLMFSLGCSVEIRKLWS 60 



Qy 

Db 

Qy 

Db 

Qy 

Db 

Qy 

Db 



61 HIRRPWGIAVGLLCQFGLMPFTAYLLAISFSLKPVQAIAVLIMGCCPGGTISNIFTFWVD 120 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I ! : II 

61 HI RRPWGI AVGLLCQFGLMP FTAYLLAI S FS LKPVQAIAVLTMGCCRG APSLTFSPS 117 

121 GDMDLSISMTTCSTVAALGMMPLCIYLYTWSWSLQQNLTIPYQNI GITLVCLTIPV 176 

I I : : II I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I II I I I I I I I 

118 GLMEIWIS GALGMMPLCIYLYTWSWSLQQNLTIPYQNIGLSLGITLVCLTIPV 170 



177 



236 



A F G VYVN Y RW PKQSKIILKI G AWG G VL L L WAVAG WLAK GSWNSDITLLTISFIFPLI 

I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I I I I I I I I I I I I I I I 

171 AFGVYVN YRWP K- SKULK- - AVVGGVLL L VVAVAGVVLAKGSWNS D ITLLTISFIFPLI 227 

237 GHWGFLLALFTHQSWQRCRTISLETGAQNIQMCITMLQLSFTAEHLVQMLSF-PLAYGL 295 

II I I I III I I I I I I I I I II:: I 11:1:: I : : I I I I I I 

228 GHVTGFLLALFTHQSWQ--RTLPIFLGLAFKTPCDTLLAMTSCPECSRLIYAFIPLLYGL 285 



Qy 



Db 



296 FQLIDGFLIV 305 

I I I I I I I I I I 

286 FQLIDGFLIV 295 



RESULT 4 
AAR77224 

ID AAR77224 standard; protein; 348 AA. 
XX 
AC AAR77224;
XX

DT 17-DEC-1995 (first entry)
XX

DE Hamster ileal/renal bile acid cotransporter.
XX

KW Ileal/renal bile acid cotransporter; therapeutic; gene therapy;

KW diagnostic.
XX

OS Cricetulus griseus.
XX

PN WO9517905-A1.
XX

PD 06-JUL-1995.
XX

PF 29-DEC-1994; 94WO-US014431 . 
XX 

PR 29-DEC-1993; 93US-00176126.
XX 

PA (UYWA-) UNIV WAKE FOREST. 
XX 

PI Dawson PA; 
XX 

DR WPI; 1995-246189/32. 

DR N-PSDB; AAQ91108. 
XX 

PT Hamster and human ileal and bile acid transport DNA and protein - useful 

PT in treatment of e.g. hypercholesterolaemia, diabetes and various 

PT digestive diseases, and in gene therapy to restore bile acid uptake 

PT activity. 

XX 

PS Claim 34; Page 104-106; 148pp; English. 
XX 

CC The ileal/renal bile acid cotransporter protein is useful in the 

CC treatment of hypercholesterolaemia, diabetes, heart disease, liver 

CC disease and various digestive disorders. The cDNA may be used in gene

CC therapy to restore bile acid uptake activity to patients whose ileum has 

CC been surgically resected for diseases such as Crohn disease, patients 

CC born with congenital defects in the bile transporter, and patients 

CC suffering from adult-onset chronic idiopathic bile acid diarrhoea. The 

CC DNA and protein may be used in screening methods as modulators of 

CC ileal/renal bile acid cotransport activity 

XX 

SQ Sequence 348 AA; 

Query Match 44.7%; Score 884; DB 2; Length 348; 

Best Local Similarity 46.9%; Pred. No. 1.1e-86;

Matches 164; Conservative 74; Mismatches 102; Indels 10; Gaps 4; 

SSSACPANSS — EEELPVGLEVHGN — L ELVFT WS T VMMGL LMFS LGC S VE I RKLW SHI 62 
: I I I I : : I : : I : I I : I : I I : : : I : I I I : I I : I I : I I : 



I I I I I I || llllhll I : : I : : : I : I I I I : III I I I I I I I II I MINI 



I I I I : I I I I I I I : I I I I I I I I ::: I I I : I I I : I I : I I I I I I : I : I I 



I : : I I : : : I I I I I I I : : I : I : : : : I I I : I : : I : I I I : I : I : I I 



Qy 


7 


Db 


3 


Qy 


63 


Db 


63 


Qy 


123 


Db 


123 


Qy 


183 


Db 


183 


Qy 


243 


Db 


243 


Qy 


303 



I I I :: I I I I II I : I I : : I I I I : I I : MM I : I I 



II II: Mil: I : I : I I I 



Db 303 I LLGAYVAYKK CHGKNNTELQEKTDNEMEPRS S FQETNKGFQPDEK 348 



RESULT 5 
AAR77225 

ID AAR77225 standard; protein; 348 AA. 
XX 

AC AAR77225; 
XX 

DT 17-DEC-1995 (first entry) 
XX 

DE Human ileal/renal bile acid cotransporter.
XX 

KW Ileal/renal bile acid cotransporter; therapeutic; gene therapy; 

KW diagnostic. 

XX 

OS Homo sapiens. 
XX 

PN WO9517905-A1. 
XX 

PD 06-JUL-1995. 
XX 

PF 29-DEC-1994; 94WO-US014431 . 
XX 

PR 29-DEC-1993; 93US-00176126.
XX 

PA (UYWA-) UNIV WAKE FOREST. 
XX 

PI Dawson PA; 
XX 

DR WPI; 1995-246189/32. 

DR N-PSDB; AAQ91109. 
XX 

PT Hamster and human ileal and bile acid transport DNA and protein - useful 

PT in treatment of e.g. hypercholesterolemia, diabetes and various 

PT digestive diseases, and in gene therapy to restore bile acid uptake 

PT activity. 

XX 

PS Claim 34; Page 111-114; 148pp; English. 
XX 

CC The ileal/renal bile acid cotransporter protein is useful in the 

CC treatment of hypercholesterolaemia, diabetes, heart disease, liver

CC disease and various digestive disorders. The cDNA may be used in gene

CC therapy to restore bile acid uptake activity to patients whose ileum has 

CC been surgically resected for diseases such as Crohn disease, patients 

CC born with congenital defects in the bile transporter, and patients 

CC suffering from adult-onset chronic idiopathic bile acid diarrhoea. The 

CC DNA and protein may be used in screening methods as modulators of 

CC ileal/renal bile acid cotransport activity 

XX 

SQ Sequence 348 AA;

Query Match 43.5%; Score 860.5; DB 2; Length 348; 
Best Local Similarity 45.6%; Pred. No. 3.7e-84; 

Matches 160; Conservative 68; Mismatches 104; Indels 19; Gaps 4; 

Qy 5 CSSSSACPANSSEEELPVGLEVHGNLELVFTWSTVMMGLLMFSLGCSVEIRKLWSHIRR 64 



11:1 |: : I :| : I |::: I : I I I : I I : I I I : I Ihl 

Db 14 CS GAS C WP ESN FNN I — LS WLSTVLT I LLALVMFSMGCNVEI KKFLGHI KR 64 

Qy 65 PWGIAVGLLCQFGLMPFTAYLLAISFSLKPVQAIAVLIMGCCPGGTISNIFTFWVDGDMD 124 

I I I I I I I I I I I : I I I :: I ::: I : I : I I : I I I : I I II I I I I I I : I I I I I I I 

Db 65 PWGICVGFLCQFGIMPLTGFILSVAFDILPLQAVWLIIGCCPGGTASNILAYWVDGDMD 124 

Qy 125 LSISMTTCSTVAALGMMPLCIYLYTWSWSLQQNLTIPYQNIGITLVCLTIPVAFGVYVNY 184 

I I : I I I I I I I : I I I I I I I I : : I I I : : I I I I I I : I I I : I I : I : : I I : 

Db 125 LSVSMTTCSTLIALGMMPLCLLIYTKMWDSGSIVIPYDNIGTSLVALWPVSIGMFVNH 184

Qy 185 RWPKQSKIILKIGAWGGVLLLWAVAGWLAKGSWNSDITLLTISFIFPLIGHVTGFLL 244 

: I I : : : I I I I I I I : : I : I : : : : I I I : I : : I I I I I I : I : I I II 

Db 185 KWPQKAKI I LKI GS I AGAI LI VLI AWGGILYQSAWI I APKLWI I GTI FPVAGYS LGFLL 244 

Qy 245 ALFTHQSWQRCRTISLETGAQNIQMCITMLQLSFTAEHLVQMLSFPLAYGLFQLIDGFLI 304 

I I I I I I :: I I I I I I : I I :: I I I II I I : : I I I I : I I I : 

Db 245 ARIAGLPWYRCRTVAFETGMQNTQLCSTIVQLSFTPEELNWFTFPLIYSIFQLAFAAIF 304 

Qy 305 VAAYQTYKRRLKNKHGKKNSGCTEVCHTRKSTSSRETNAFLEVNEEGAITP 355 

: I I I : I I I : I I : : : I : I I I 

Db 305 LGFYVAYKK CHGKNKAEIPE SKENGTEPESSFYKAN--GGFQP 345 
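
The percentages in each score block can be reproduced from the counts printed beside them. The sketch below is an inference from the figures in this listing, not a documented definition: it assumes that "Best Local Similarity" is Matches divided by the total number of alignment columns (Matches + Conservative + Mismatches + Indels), and that "Query Match" is the hit's Score divided by the query's self-score, taken here as 1979 from the 100.0% hit in RESULT 1. The function names are hypothetical.

```python
def percent(numerator, denominator):
    """Return numerator/denominator as a percentage rounded to one decimal."""
    return round(100.0 * numerator / denominator, 1)


def best_local_similarity(matches, conservative, mismatches, indels):
    """Assumed definition: identical columns over all alignment columns."""
    columns = matches + conservative + mismatches + indels
    return percent(matches, columns)


def query_match(score, self_score):
    """Assumed definition: hit score relative to the query's self-score."""
    return percent(score, self_score)


# Self-score assumed from RESULT 1 (Query Match 100.0%; Score 1979).
SELF_SCORE = 1979

# RESULT 5: Matches 160; Conservative 68; Mismatches 104; Indels 19; Score 860.5
print(best_local_similarity(160, 68, 104, 19))  # 45.6
print(query_match(860.5, SELF_SCORE))           # 43.5
```

The same two functions also reproduce the figures reported for RESULT 3 (80.0% and 59.6%), which is why this reading of the fields seems plausible.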



RESULT 6 
AAO19649

ID AAO19649 standard; protein; 348 AA.
XX

AC AAO19649;
XX 

DT 28-MAR-2003 (first entry) 
XX 

DE Human ileal sodium-dependent bile acid transporter protein. 
XX 

KW Human; ileal sodium-dependent bile acid transporter gene; SLC10A2; SNP; 

KW single nucleotide polymorphism; chromosome 13q33; cardiant; 

KW antiarteriosclerotic; antilipemic. 
XX 

OS Homo sapiens. 
XX 



FH   Key             Location/Qualifiers
FT   Misc-difference 65
FT                   /note= "optionally Leu depending on SNP present in gene"
FT   Misc-difference 98
FT                   /note= "optionally Ile depending on SNP present in gene"
FT   Misc-difference 159
FT                   /note= "optionally Ile depending on SNP present in gene"
FT   Misc-difference 290
FT                   /note= "optionally Ser depending on SNP present in gene"
FT   Misc-difference 296
FT                   /note= "optionally Leu depending on SNP present in gene"
FT   Misc-difference 316
FT                   /note= "optionally Glu depending on SNP present in gene"



XX 

PN WO200283944-A2. 
XX 

PD 24-OCT-2002.
XX

PF 11-APR-2002; 2002WO-GB001681.
XX 

PR 17-APR-2001; 2001GB-00009296 . 

PR 19-APR-2001; 2001US-0284530P . 
XX 

PA (ASTR ) ASTRAZENECA AB . 

PA (ASTR ) ASTRAZENECA UK LTD. 

XX 

PI Morten JEN; 
XX 

DR WPI; 2003-046927/04. 
XX 

PT Diagnosing polymorphism in SLC10A2 in a human for assessing the 

PT pharmacogenetics of a drug for treating cardiovascular and hyperlipidemic 

PT conditions, by determining the status of the human by reference to 

PT polymorphism in SLC10A2. 

XX 

PS Claim 10; Page 21; 21pp; English. 
XX 

CC The present invention relates to a method of diagnosing polymorphisms in 

CC SLC10A2 (human ileal sodium-dependent bile acid transporter gene) in a 

CC human, which involves determining the status of the human by reference to 

CC polymorphisms in SLC10A2. The method is useful for assessing the 

CC pharmacogenetics of a drug acting at SLC10A2. The SLC10A2 gene 

CC polymorphism is useful as a genetic marker in a linkage study. SLC10A2 

CC drugs are also useful for treating cardiovascular (e.g. atherosclerosis) 

CC and hyperlipidemic conditions. The SLC10A2 gene is found at chromosome 

CC 13q33. The present sequence is the protein of the invention with 

CC associated alternative amino acids 

XX 

SQ Sequence 348 AA;

Query Match 43.5%; Score 860.5; DB 6; Length 348;
Best Local Similarity 45.6%; Pred. No. 3.7e-84; 

Matches 160; Conservative 68; Mismatches 104; Indels 19; Gaps 4; 

Qy 5 CSSSSACPANSSEEELPVGLEVHGNLELVFTWSTVMMGLLMFSLGCSVEIRKLWSHIRR 64

I I : I I : : I : I : I I : : : I : I I I : I I : I I I : I I I : I 

Db 14 CSGASCWPESNFNNI LSWLSTVLTI LIALVMFSMGCNVEI KKFLGHI KR 64 



Qy 65 PWGIAVGLLCQFGLMPFTAYLLAISFSLKPVQAIAVLIMGCCPGGTISNIFTFWVDGDMD 124 

I I I I II IMIhll I : : I : : : I : I : I I : I I I : I I I I I I I Ml : I I I I I I I 

Db 65 PWGICVGFLCQFGIMPLTGFILSVAFDILPLQAVWLIIGCCPGGTASNILAYWVDGDMD 124 

Qy 125 LSISMTTCSTVAALGMMPLCIYLYTWSWSLQQNLTIPYQNIGITLVCLTIPVAFGVYVNY 184 

I I : I I I I I I I : I I I I I I I I :: I I I : : I I I I I I : I I I : I I : I : : I I : 

Db 125 LSVSMTTCSTLLALGMMPLCLLIYTKMWDSGSIVIPYDNIGTSLVALVVPVSIGMFVNH 184



Qy 185 RWPKQSKIILKIGAWGGVLLLWAVAGWLAKGSWNSDITLLTISFIFPLIGHVTGFLL 244

: I I : : : I I I I I I I : : I : I : : : : I I I : I : : I I I I I I : I : I I I I 

Db 185 KWPQKAKIILKIGSIAGAILIVLIAWGGILYQSAWIIAPKLWIIGTIFPVAGYSLGFLL 244 

Qy 245 ALFTHQSWQRCRTISLETGAQNIQMCITMLQLSFTAEHLVQMLSFPLAYGLFQLIDGFLI 304 

I I I I I I :: I I I I I I : I I :: I I I II I I : : I I I I : I I I .: 

Db 245 ARIAGLPWYRCRTVAFETGMQNTQLCSTIVQLSFTPEELNWFTFPLIYSIFQLAFAAIF 304 



Qy 305 VAAYQTYKRRLKNKHGKKNSGCTEVCHTRKSTSSRETNAFLEVNEEGAITP 355 

: I II: Ml : I I : ::| : I I I 

Db 305 LGFYVAYKK CHGKNKAEIPE SKENGTEPESSFYKAN--GGFQP 345



RESULT 7 
ADD48705 

ID ADD48705 standard; protein; 362 AA. 
XX 

AC ADD48705; 
XX 

DT 29-JAN-2004 (first entry) 
XX 

DE Rat Protein P26435, SEQ ID NO 14414. 
XX 

KW Rat; pain; neuronal tissue; gene therapy; spinal segmental nerve injury; 

KW chronic constriction injury; CCI; spared nerve injury; SNI; Chung. 

XX 

OS Rattus norvegicus.
XX 

PN WO2003016475-A2. 
XX 

PD 27-FEB-2003. 
XX 

PF 14-AUG-2002; 2002WO-US025765 . 
XX 

PR 14-AUG-2001; 2001US-0312147P . 

PR 01-NOV-2001; 2001US-0346382P.

PR 26-NOV-2001; 2001US-0333347P . 
XX 

PA (GEHO ) GEN HOSPITAL CORP. 

PA (FARB ) BAYER AG. 

XX 

PI Woolf C, D'urso D, Befort K, Costigan M; 
XX 

DR WPI; 2003-268312/26. 

DR GEN BANK; P26435. 
XX 

PT New composition comprising two or more isolated polypeptides, useful for 

PT preparing a medicament for treating pain in an animal. 

XX 

PS Claim 1; Page; 1017pp; English. 
XX 

CC The invention discloses a composition comprising two or more isolated rat 

CC or human polynucleotides or a polynucleotide which represents a fragment, 

CC derivative or allelic variation of the nucleic acid sequence. Also 

CC claimed are a vector comprising the novel polynucleotide, a host cell 

CC comprising the vector, a method for identifying a nucleotide sequence 

CC which is differentially regulated in an animal subjected to pain and a 

CC kit to perform the method, an array, a method for identifying an agent 

CC that increases or decreases the expression of the polynucleotide sequence 

CC that is differentially expressed in neuronal tissue of a first animal 

CC subjected to pain, a method for identifying a compound which regulates 

CC the expression of a polynucleotide sequence which is differentially 

CC expressed in an animal subjected to pain, a method for identifying a 

CC compound that regulates the activity of one or more of the 

CC polynucleotides, a method for producing a pharmaceutical composition, a 



CC method for identifying a compound or small molecule that regulates the 

CC activity in an animal of one or more of the polypeptides given in the 

CC specification, a method for identifying a compound useful in treating 

CC pain and a pharmaceutical composition comprising the one or more 

CC polypeptides or their antibodies. The polynucleotide or the compound that 

CC modulates its activity is useful for preparing a medicament for treating 

CC pain (e.g. spinal segmental nerve injury (Chung), chronic constriction

CC injury (CCI) and spared nerve injury (SNI)) in an animal (e.g. gene 

CC therapy). The sequence presented is a rat protein (shown in Table 2 of

CC the specification) which is differentially expressed during pain. Note: 

CC The sequence data for this patent did not form part of the printed 

CC specification, but was obtained in electronic form directly from WIPO at 

CC ftp.wipo.int/pub/published_pct_sequences.

XX 

SQ Sequence 362 AA; 



Query Match 28.3%; Score 559.5; DB 7; Length 362; 

Best Local Similarity 37.2%; Pred. No. 1.6e-51; 

Matches 133; Conservative 69; Mismatches 135; Indels 21; Gaps 9; 

Qy 10 ACPANSSEEELPVGLEVHGNLELVFTWSTVMMGLLMFSLGCSVEIRKLWSHIRRPWGIA 69 

: I I I III I : : : : : I : I : I I I I I : : I I : : I : : I I : 
Db 7 SAPFNFS LPPGFG-HRATDKALSIILVLMLLLIMLSLGCTMEFSKIKAHLWKPKGVI 62 

Qy 70 VGLLCQFGLMPFTAYLLAISFSLKPVQAIAVLIMGCCPGGTISNIFTFWVDGDMDLSISM 129 

I I : I I I : I I I : I I II : : I : I : I I I I I I I : I I : I I : I I I : I I I I 
Db 63 VALVAQFGIMPLAAFLLGKIFHLSNIEALAILICGCSPGGNLSNLFTLAMKGDMNLSIVM 122 

Qy 130 TTCSTVAALGMMPLCIYLYT WSWSLQQNLTIPYQNIGITLVCLTIPVAFGVYVNYRW 186

I I I I : : I I I I I I I : I : I : : I : : I I : I I : I I : I I I : : : 
Db 123 TTCSSFSALGMMPLLLYVYSKGIYDGDLKDK--VPYKGIMISLVIVLIPCTIGIVLKSKR 180

Qy 187 PKQSKIILKIGAVVGGVLLLVVAVAGVVLAKGSWNSDIT — LLTISFIFPLIGHVTGFLL 244 

I | | | | : : : | : | | : | : | | | I : I I : I : : I 

Db 181 PHYVPYILKGGMIITFLLSVAVTALSVINVGNSIMFVMTPHLLATSSLMPFSGFLMGYIL 240 

Qy 245 -ALFTHQSWQRC-RTISLETGAQNIQMCITMLQLSFTAEHLVQMLSFPLAYGLFQLIDGF 302 

III I I I I I I : I I I I I I I : I I : I :: I I : : I I I I : I I I : I 

Db 241 SALF— QLNPSCRRTISMETGFQNIQLCSTILNVTFPPEVIGPLFFFPLLYMIFQLAEGL 298 

Qy 303 LIVAAYQTYKRRLKNKHGKKNSGCTEVCHTRKSTSSRETNAFLEVNEEGAITPGPPGP 360 

I I : : : I : : I I : : : : I I : I I I I I I 

Db 299 LIIIIFRCYEKI KPPKDQTKITYKAAATEDATPAALEKGTHNGNIPPLQPGP 350 
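
Each record in this listing is a run of lines that start with a two-letter code (ID, AC, DT, DE, KW, OS, PN, and so on), separated by XX spacer lines, with continuation lines repeating the code. A minimal sketch of reading such a record into a dictionary is given below. The set of codes is simply those observed in this listing and is an assumption rather than an official list; `parse_record`, `KNOWN_TAGS` and `SAMPLE` are hypothetical names introduced for illustration.

```python
import re

# Two-letter line codes observed in this listing (assumed set, not an official list).
KNOWN_TAGS = {
    "ID", "AC", "DT", "DE", "KW", "OS", "FH", "FT", "PN", "PD",
    "PF", "PR", "PA", "PI", "DR", "PT", "PS", "CC", "SQ",
}

# A code, at least one space, then a non-empty value; bare "XX" lines do not match.
TAG_LINE = re.compile(r"^([A-Z]{2})\s+(\S.*?)\s*$")


def parse_record(text):
    """Collect values per line code, joining continuation lines with a space."""
    fields = {}
    for line in text.splitlines():
        match = TAG_LINE.match(line)
        if not match or match.group(1) not in KNOWN_TAGS:
            continue  # skips XX spacers, blank lines, and alignment rows
        tag, value = match.groups()
        fields.setdefault(tag, []).append(value)
    return {tag: " ".join(values) for tag, values in fields.items()}


# Header lines copied from RESULT 7 of this listing.
SAMPLE = """\
ID ADD48705 standard; protein; 362 AA.
XX
AC ADD48705;
XX
DT 29-JAN-2004 (first entry)
XX
DE Rat Protein P26435, SEQ ID NO 14414.
XX
KW Rat; pain; neuronal tissue; gene therapy; spinal segmental nerve injury;
KW chronic constriction injury; CCI; spared nerve injury; SNI; Chung.
XX
OS Rattus norvegicus.
"""

record = parse_record(SAMPLE)
print(record["AC"])  # ADD48705;
print(record["OS"])  # Rattus norvegicus.
```

Restricting the parser to a known set of codes keeps stray two-letter fragments in damaged alignment rows from being mistaken for fields.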



RESULT 8 
AAE37351 

ID AAE37351 standard; protein; 349 AA. 
XX 

AC AAE37351; 
XX 

DT 27-AUG-2003 (first entry) 
XX 

DE Human sodium/bile acid cotransporter, 8587 protein.
XX 

KW Human; cardiovascular disorder; coronary artery disease; bradycardia; 
KW restenosis; cardiac hypertrophy; ischaemia reperfusion injury; angina; 



KW arteriosclerosis; coronary artery ligation; rheumatic heart disease; 

KW heart failure; hypertension; cardiomyopathy; myocardial infarction; 

KW arterial inflammation; microembolism; atherosclerosis; endocarditis; 

KW vascular heart disease; valvular disease; arrhythmia; gene therapy; 

KW sinus node dysfunction; sodium-bile acid cotransporter. 
XX 

OS Homo sapiens. 
XX 

PN WO2003039341-A2. 
XX 

PD 15-MAY-2003. 
XX 

PF 05-NOV-2002; 2002WO-US035538.
XX 

PR 05-NOV-2001; 2001US-0339582P . 
XX 

PA (MILL- ) MILLENNIUM PHARM INC. 
XX 

PI Logan TJ, Chun M, Galvin KM; 
XX 

DR WPI; 2003-441437/41. 

DR N-PSDB; AAD56518. 
XX 

PT Treating a subject having a cardiovascular disorder, e.g. angina,

PT arrhythmia, or restenosis, comprises administering a 139, 258, 1261, 

PT 1486, 2398, 2414, 7660, 8587, 10183, 10550, 12680, 17921, 32248, 60489 or 

PT 93804 modulator. 

XX 

PS Disclosure; Page 109-110; 124pp; English. 
XX 

CC The invention relates to methods and compositions for treating a subject 

CC having a cardiovascular disorder using 139, 258, 1261, 1486, 2398, 2414, 

CC 7660, 8587, 10183, 10550, 12680, 17921, 32248, 60489 or 93804 modulator. 

CC The invention is useful for treating a cardiovascular disorder, including 

CC arteriosclerosis, atherosclerosis, vascular wall remodeling, restenosis, 

CC cardiac hypertrophy, ischaemia reperfusion injury, arterial inflammation, 

CC ventricular remodelling, rapid ventricular pacing, tachycardia, coronary 

CC microembolism, bradycardia, pressure overload, aortic bending, coronary 

CC artery ligation, vascular heart disease, valvular disease, including but 

CC not limited to, valvular degeneration caused by calcification, rheumatic 

CC heart disease, endocarditis, or complications of artificial valves; 

CC atrial fibrillation, long-QT syndrome, congestive heart failure, sinus 

CC node dysfunction, angina, heart failure, hypertension, atrial flutter, 

CC atrial fibrillation, pericardial disease, including but not limited to 

CC pericardial effusion and pericarditis, cardiomyopathies (e.g. dilated 

CC cardiomyopathy or idiopathic cardiomyopathy), myocardial infarction,

CC coronary artery disease, coronary artery spasm, ischaemic disease, 

CC arrhythmia, sudden cardiac death, and cardiovascular developmental 

CC disorders. The invention is also useful in gene therapy. The present 

CC sequence is human sodium/bile acid cotransporter protein. This sequence 

CC is used to illustrate the method of the invention 
XX 

SQ Sequence 349 AA;



Query Match 27.9%; Score 553; DB 6; Length 349; 

Best Local Similarity 36.0%; Pred. No. 7.6e-51; 

Matches 124; Conservative 77; Mismatches 109; Indels 34; Gaps 10; 



Qy 31 ELVFTWSTVMMGLLMFSLGCSVEIRKLWSHIRRPWGIAVGLLCQFGLMPFTAYLLAISF 90 

:| :|: I: :| llll::l I : : I : : I I : I : I : I : I : I I I I : : I I 

Db 24 DIALSVILVFMLFFIMLSLGCTMEFSKIKAHLWKPKGIAIALVAQYGIMPLTAFVLGKVF 83 

Qy 91 SLKPVQAIAVLIMGCCPGGTISNIFTFWVDGDMDLSISMTTCSTVAALGMMPLCIYLYT- 149 

II : : I : I : I : I I III : I I : I : : I I I : I I I I I I I II I I I I I I I : I : I : 

Db 84 RLKNIEALAILVCGCSPGGNLSNVFSLAMKGDMNLSIVMTTCSTFCALGMMPLLLYIYSR 143

Qy 150 --WSWSLQQNLTIPYQNIGITLVCLTIPVAFGVYVNYRWPKQSKIILKIGAWGGVLLLV 207

: I : : | | : | | : | | : | | | : : : I : : : : | | : : : I : 
Db 144 GI YDGDLKDK- -VP YKGIVI SLVLVLI PCTI GIVLKSKRPQYMRYVIKGGMI I ILL 197 

Qy 208 VAVAGWLAKGSWN S D I TLLTI S FI FPLI GH VTGFLL-ALFTHQSWQRC-RTI S 259 

: I I II: : I I : I : I I I : I : : I I I I I I I I : I 

Db 198 CSVAVTVLSAINVGKSIMFAMTPLLIATSSLMPFIGFLLGYVLSALFCLNG--RCRRTVS 255

Qy 260 LETGAQNIQMCITMLQLSFTAEHLVQMLSFPLAYGLFQLIDGFLIVAAYQTYKRRLKNKH 319 

: I I I I I : I : I I : I : : I I : : I I I I : I I I : I I : : I : I : : I 
Db 256 METGCQNVQLCSTILNVAFPPEVIGPLFFFPLLYMIFQLGEGLLLIAIFWCYE-KFKTPK 314 

Qy 320 GKKNSGCTEVCHTRKSTSSRETNAFLEVNEEGAITPGPPGPMDC 363 

I | : : : | : | | I I : I II 

Db 315 DK TKMI YTAATT EETI PGALGNGT YKGEDC 344 



RESULT 9 
ADD48707 

ID ADD48707 standard; protein; 349 AA. 
XX 

AC ADD48707; 
XX 

DT 29-JAN-2004 (first entry) 
XX 

DE Human Protein Q14973, SEQ ID NO 14416. 
XX 

KW Human; pain; neuronal tissue; gene therapy; 

KW spinal segmental nerve injury; chronic constriction injury; CCI;

KW spared nerve injury; SNI; Chung. 

XX 

OS Homo sapiens. 
XX 

PN WO2003016475-A2.
XX 

PD 27-FEB-2003. 
XX 

PF 14-AUG-2002; 2002WO-US025765.
XX

PR 14-AUG-2001; 2001US-0312147P.

PR 01-NOV-2001; 2001US-0346382P.

PR 26-NOV-2001; 2001US-0333347P.
XX 

PA (GEHO ) GEN HOSPITAL CORP. 

PA (FARB ) BAYER AG. 

XX 

PI Woolf C, D'urso D, Befort K, Costigan M; 
XX 



DR WPI; 2003-268312/26. 

DR GENBANK; Q14973.
XX 

PT New composition comprising two or more isolated polypeptides, useful for 

PT preparing a medicament for treating pain in an animal. 

XX 

PS Claim 1; Page; 1017pp; English. 
XX 

CC The invention discloses a composition comprising two or more isolated rat 

CC or human polynucleotides or a polynucleotide which represents a fragment, 

CC derivative or allelic variation of the nucleic acid sequence. Also 

CC claimed are a vector comprising the novel polynucleotide, a host cell 

CC comprising the vector, a method for identifying a nucleotide sequence 

CC which is differentially regulated in an animal subjected to pain and a 

CC kit to perform the method, an array, a method for identifying an agent 

CC that increases or decreases the expression of the polynucleotide sequence 

CC that is differentially expressed in neuronal tissue of a first animal 

CC subjected to pain, a method for identifying a compound which regulates 

CC the expression of a polynucleotide sequence which is differentially 

CC expressed in an animal subjected to pain, a method for identifying a 

CC compound that regulates the activity of one or more of the 

CC polynucleotides, a method for producing a pharmaceutical composition, a 

CC method for identifying a compound or small molecule that regulates the 

CC activity in an animal of one or more of the polypeptides given in the 

CC specification, a method for identifying a compound useful in treating 

CC pain and a pharmaceutical composition comprising the one or more 

CC polypeptides or their antibodies. The polynucleotide or the compound that 

CC modulates its activity is useful for preparing a medicament for treating 

CC pain (e.g. spinal segmental nerve injury (Chung), chronic constriction 

CC injury (CCI) and spared nerve injury (SNI)) in an animal (e.g. gene 

CC therapy). The sequence presented is a human protein (shown in Table 2 of

CC the specification) which is differentially expressed during pain. Note: 

CC The sequence data for this patent did not form part of the printed 

CC specification, but was obtained in electronic form directly from WIPO at 

CC ftp.wipo.int/pub/published_pct_sequences.

XX 

SQ Sequence 349 AA; 

Query Match 27.9%; Score 553; DB 7; Length 349; 

Best Local Similarity 36.0%; Pred. No. 7.6e-51; 

Matches 124; Conservative 77; Mismatches 109; Indels 34; Gaps 10; 

Qy 31 ELVFTWSTVMMGLLMFSLGCSVEIRKLWSHIRRPWGIAVGLLCQFGLMPFTAYLLAISF 90

: I : I : : I I I I I : : I I : : I : : I I : I : I : I : I : I I I I : : I I

Db 24 DIALSVILVFMLFFIMLSLGCTMEFSKIKAHLWKPKGIAIALVAQYGIMPLTAFVLGKVF 83

Qy 91 SLKPVQAIAVLIMGCCPGGTISNIFTFWVDGDMDLSISMTTCSTVAALGMMPLCIYLYT- 149

|| ::|:|:|: II Ml :||:|: : I I I : I I I I I I I I I lllllll :|:|

Db 84 RLKNIEALAILVCGCSPGGNLSNVFSLAMKGDMNLSIVMTTCSTFCALGMMPLLLYIYSR 143

Qy 150 --WSWSLQQNLTIPYQNIGITLVCLTIPVAFGVYVNYRWPKQSKIILKIGAWGGVLLLV 207

: I : : I I : I I : I I : I I I : : : I : : : : I I : : : I :

Db 144 GI YDGDLKDK- -VP YKGIVI SLVLVLI PCTI GIVLKSKRPQYMRYVIKGGMI I ILL 197

Qy 208 VAVAGWLAKG S WN S D I TLLTI SFI FPLIGHVTGFLL-ALFTHQSWQRC-RTI S 259

II II: : I I : I : I I I : I : : I I I I I I I I : I

Db 198 CSVAVTVLSAINVGKSIMFAMTPLLIATSSLMPFIGFLLGYVLSALFCLNG--RCRRTVS 255



Qy 260 LETGAQNIQMCITMLQLSFTAEHLVQMLSFPLAYGLFQLIDGFLIVAAYQTYKRRLKNKH 319

: I I I I I : I : I I : I : : I I : : I I I I : I I I : I I : : I : I : : I 

Db 256 METGCQNVQLCSTILNVAFPPEVIGPLFFFPLLYMIFQLGEGLLLIAIFWCYE-KFKTPK 314 

Qy 320 GKKNSGCTEVCHTRKSTSSRETNAFLEVNEEGAITPGPPGPMDC 363 

I | : : : | : | I I I : I II 

Db 315 DK TKMIYTAATT EETIPGALGNGTYKGEDC 344 



RESULT 10 
ABP75825 

ID ABP75825 standard; protein; 270 AA. 
XX 

AC ABP75825; 
XX 

DT 10-FEB-2003 (first entry) 
XX 

DE Human secretory polypeptide SPTM SEQ ID NO 1009. 
XX 

KW Human; SPTM; autoimmune disorder; inflammatory disorder; AIDS; anaemia; 

KW asthma; Crohn's disease; neurological disorder; epilepsy; cancer; 

KW Huntington's disease; Alzheimer's disease; Creutzfeldt-Jakob disease;

KW multiple sclerosis; Parkinson's disease; cell proliferative disorder; 

KW anti-inflammatory; immunosuppressive; neuroprotective; nootropic; 

KW neuroleptic; anticonvulsant; cytostatic; antiparkinsonian; anxiolytic; 

KW antipsoriatic; antianaemic; anti-HIV; human immunodeficiency virus; 

KW secretory polynucleotide; secretory protein. 

XX 

OS Homo sapiens. 
XX 

PN WO200283876-A2. 
XX 

PD 24-OCT-2002. 
XX 

PF 27-MAR-2002; 2002WO-US009921.
XX

PR 29-MAR-2001; 2001US-0280067P.

PR 29-MAR-2001; 2001US-0280068P.

PR 16-MAY-2001; 2001US-0291280P.

PR 17-MAY-2001; 2001US-0291829P.

PR 17-MAY-2001; 2001US-0291849P.

PR 19-JUN-2001; 2001US-0299428P.

PR 20-JUN-2001; 2001US-0299776P.

PR 20-JUN-2001; 2001US-0300001P.
XX 

PA (INCY-) INCYTE GENOMICS INC. 
XX 

PI Daffo A, Jones AL, Tran AB, Dahl CR, Gietzen D, Chinn J; 

PI Dufour GE, Hillman JL, Yu JY, Tuason O, Yap PE, Amshey SR; 

PI Daughtery SC, Dam TC, Liu TF, Nguyen DA, Kleefeld Y, Gerstin EH; 

PI Peralta CH, David MH, Lewis SA, Chen AJ, Panzer SR, Harris B;

PI Flores V, Marwaha R, Lo A, Lan RY, Urashka ME; 

XX 

DR WPI; 2003-075543/07. 

DR N-PSDB; ABZ36267. 
XX 



PT New human secretory proteins and polynucleotides, useful for diagnosing, 

PT treating or preventing autoimmune/inflammatory disorders (e.g. AIDS),

PT neurological disorders (e.g. Alzheimer's), or cell proliferations or 

PT cancers. 
XX 

PS Claim 27; SEQ ID NO 1009; 458pp + Sequence Listing; English. 
XX 

CC The invention relates to a secretory polynucleotide (designated sptm) 

CC comprising any of 567 polynucleotide sequences (ABZ35837-ABZ36403), a

CC naturally occurring polynucleotide sequence at least 90 % identical to 

CC the polynucleotide sequence, a polynucleotide complementary to them or an 

CC RNA equivalent of them. The polypeptide or polynucleotide are useful for 

CC treating, preventing or diagnosing a disease or condition associated with 

CC the expression of functional SPTM. These are particularly useful for 

CC diagnosing, treating or preventing autoimmune/inflammatory disorders 

CC (e.g. acquired immunodeficiency syndrome, anaemia, asthma or Crohn's 

CC disease), neurological disorders (e.g. epilepsy, Huntington's disease, 

CC dementia, stroke, Alzheimer's disease, Creutzfeldt-Jakob disease,

CC multiple sclerosis, cerebral palsy, Parkinson's disease, anxiety, 

CC schizophrenia or amnesia), or cell proliferative disorders (e.g. 

CC psoriasis, polycythemia vera, or cancers including adenocarcinoma, 

CC leukaemia, lymphoma, melanoma, myeloma, sarcoma or cancers of the brain, 

CC breast, cervix or prostate). The present sequence is one of the SPTM

CC proteins of the invention (ABP75384-ABP75962). Note: The sequence data

CC for this patent did not form part of the printed specification, but was

CC obtained in electronic format directly from WIPO at

CC ftp.wipo.int/pub/published_pct_sequences

XX 

SQ Sequence 270 AA; 



Query Match 19.9%; Score 393.5; DB 6; Length 270; 

Best Local Similarity 34.2%; Pred. No. 1e-33;

Matches 82; Conservative 47; Mismatches 76; Indels 35; Gaps 4; 

Qy 98 IAVLIMGCCPGGTISNIFTFWVDGDMDLSISMTTCSTVAALGMMPLCIYLYTWSW---SL 154

: I I I : I I I I I I M I : : I I I I I : I I I I I I I : I I : I I I I : : : I : I : I : 
Db 2 VAVLLCGCCPGGNLSNLMSLLVDGDMNLSIIMTISSTLLALVLMPLCLWIYSWAWINTPI 61 

Qy 155 QQNLTIPYQNIGITLVCLTIPVAFGVYVNYRWPKQSKIILKI----------------GA 198

I : I : : I I I I : I I : : I : : : : I : I : I 

Db 62 VQ--LLPLGTVTLTLCSTLIPIGLGVFIRYKYSRVADYIVKVSLWSLLVTLWLFIMTGT 119

Qy 199 WGGVLLLWAVAGWLAKGSWNSDITLLTISFIFPLIGHVTGFLLALFTHQSWQRCRTI 258 

: : I I I : I I : I I I I : : I : I I I I I : 

Db 120 MLGPELLASIPAAVYVIA--------------IFMPLAGYASGYGLATLFHLPPNCKRTV 165

Qy 259 SLETGAQNIQMCITMLQLSFTAEHLVQMLSFPLAYGLFQLIDGFLIVAAYQTYKRRLKNK 318 

I II 1:1 1:1: I :|:|:| : : I III I III : : I hi : :| 

Db 166 CLETGSQNVQLCTAILKLAFPPQFIGSMYMFPLLYALFQSAEAGIFVLIYKMYGSEMLHK 225 



RESULT 11 
AAE13283 

ID AAE13283 standard; protein; 491 AA. 
XX 

AC AAE13283; 
XX 



DT 12-FEB-2002 (first entry) 
XX 

DE Human transporters and ion channels (TRICH)-10.
XX 

KW Human; transporter and ion channel; TRICH; akinesia; cystic fibrosis; 

KW diabetes mellitus; Parkinson's disease; myasthenia gravis; dementia;

KW cardiac disorder; angina; hypertension; myocarditis; hyperglycaemia;

KW neurological disorder; Alzheimer's disease; cataract; infertility; 

KW Wilson's disease; schizophrenia; Grave's disease; addison's disease; 

KW Huntington's disease; multiple sclerosis; meningitis; hypotensive; 

KW cardiant; nootropic; neuroprotective; neuroleptic; ophthalmological;

KW antithyroid; anticonvulsant; goitre; antiinflammatory. 
XX 

OS Homo sapiens. 
XX 

FH Key Location/Qualifiers 

FT Domain 241..261

FT /label= Transmembrane_domain

FT Domain 251..439

FT /note= "Sodium, acid and bile transporter domain"

FT Domain 288..307

FT /label= Transmembrane_domain

FT Domain 325..343

FT /label= Transmembrane_domain

FT Domain 416..435

FT /label= Transmembrane_domain

XX 

PN WO200177174-A2. 
XX 

PD 18-OCT-2001. 
XX 

PF 06-APR-2001; 2001WO-US011206.
XX

PR 06-APR-2000; 2000US-0195595P.

PR 12-APR-2000; 2000US-0196872P.

PR 20-APR-2000; 2000US-0199020P.

PR 28-APR-2000; 2000US-0200552P.

PR 05-MAY-2000; 2000US-0202348P.

PR 11-MAY-2000; 2000US-0203495P.
XX 

PA (INCY-) INCYTE GENOMICS INC. 
XX 

PI Reddy R, Thornton M, Borowsky ML, Tang YT, Khan FA, Tribouley CM; 

PI Gandhi AR, Yao MG, Sanjanwala MS, Baughn MR, Nguyen DB, Policky JL; 

PI Yue H, Seilhamer JJ, Walia NK, Lai P, Kearney L, Walsh RT, Lu DAM; 

PI Lu Y, Greene BD, Raumann BE, Patterson C; 
XX 

DR WPI; 2002-017448/02. 

DR N-PSDB; AAD22002. 
XX 

PT Polypeptides of human transporters and ion channels, useful for 

PT diagnosing, treating or preventing disorders of transport, neurological, 

PT muscle, immunological and cell proliferative disorders. 

XX 

PS Claim 1; Page 130-131; 150pp; English. 
XX 

CC The invention relates to human transporters and ion channels (TRICH) and 



CC the polynucleotides encoding them. The composition comprising TRICH or 

CC agonist of TRICH is useful for treating a disease or condition associated 

CC with decreased expression of functional TRICH or condition associated 

CC with overexpression of TRICH respectively. The composition comprising Ab 

CC is useful for diagnosing a condition of disease associated with 

CC expression of TRICH in a subject, where the disorders include a transport 

CC disorder such as akinesia, cystic fibrosis, diabetes mellitus, 

CC Parkinson's disease, myasthenia gravis, cardiac disorders associated with

CC transport e.g. angina, hypertension, myocarditis, neurological disorders 

CC associated with transport e.g. Alzheimer's disease, Wilson's disease, 

CC schizophrenia, cataracts, infertility, hyperglycaemia, Grave's disease, 

CC goitre, addison's disease, Huntington's disease, dementia, multiple 

CC sclerosis, bacterial and viral meningitis. TRICH DNA is useful for 

CC generating a transcript image of a tissue or cell type, which represents 

CC the global pattern of gene expression by a particular tissue or cell type 

CC and for analysing the proteome of a tissue or cell type. TRICH DNA is 

CC used in gene therapy. The present amino acid sequence is human TRICH10 

CC protein 

XX 

SQ Sequence 491 AA; 

Query Match 19.5%; Score 386.5; DB 5; Length 491; 

Best Local Similarity 27.1%; Pred. No. 1.3e-32; 

Matches 95; Conservative 56; Mismatches 105; Indels 95; Gaps 7; 

Qy 44 LLMFSLGCSVEIRKLWSHIRRPWGIAVGLLCQFGLMPFTAYLLAISFSLKPVQA--IAVL 101 

: I I I I : I : : : I : I i I II : I I I I : : : I 

Db 115 ITMLGLGCTVDVNHFGAHVRRP---VAALLAALPVRPPAAAGLPAGPRLQAGRGGRRGLL 171

Qy 102 IMGCCPGGTISNIFTFWVDGDMDL------------------------------------ 125

: I I I II I : I I : : I I I I I : I 

Db 172 LCGCCPGGNLSNLMSLLVDGDMNLRRAALLALSSDVGSAQTSTPGLAVSPFHLYSTYKKK 231 

Qy 126 -------------------SISMTTCSTVAALGMMPLCIYLYTWSW---SLQQNLTIPYQ 163

I I I I I I : I I : I I I I : : : I : I : I : I : I 

Db 232 VSWLFDSKLVLISAHSLFCSIIMTISSTLLALVLMPLCLWIYSWAWINTPIVQ--LLPLG 289

Qy 164 NIGITLVCLTIPVAFGVYVNYRWPKQSKIILKI----------------GAWGGVLLLV 207

: : I I | | : | | : : | : : : : | : | : I : : I I I 

Db 290 TWLTLCSTLIPIGLGVFIRYKYSRVADYIVKVSLWSLLVTLWLFIMTGTMLGPELLAS 349

Qy 208 VAVAGWLAKGSWNSDITLLTISFIFPLIGHVTGFLLALFTHQSWQRCRTISLETGAQNI 267

: I I : I | | | : : | : | | | I I : I I I I : I I : 

Db 350 IPAAVYVIA--------------IFMPLAGYASGYGLATLFHLPPNCKRTVCLETGSQNV 395

Qy 268 QMCITMLQLSFTAEHLVQMLSFPLAYGLFQLIDGFLIVAAYQTYKRRLKNK 318

I : I : I : I : I : : I I I I I I I I : : I hi : : I 

Db 396 QLCTAILKLAFPPQFIGSMYMFPLLYALFQSAEAGIFVLIYKMYGSEMLHK 446 



RESULT 12 
ABP43962 

ID ABP43962 standard; protein; 491 AA. 
XX 

AC ABP43962; 
XX 

DT 26-FEB-2003 (first entry) 



XX 

DE clone IMAGE:3502817.
XX 

KW Neuroprotective; immunomodulator; cancer; cytostatic; anti-inflammatory;

KW gene therapy; nutritional supplement; wound; burn; ulcer; 

KW Alzheimer's disease; Huntington's disease; amyotrophic lateral sclerosis; 

KW autoimmune disorder; inflammation; vulnerary. 

XX 

OS Homo sapiens.
XX 

PN WO200231111-A2. 
XX 

PD 18-APR-2002. 
XX 

PF 11-OCT-2001; 2001WO-US027760.
XX

PR 12-OCT-2000; 2000US-00687527.
XX 

PA (HYSE-) HYSEQ INC. 
XX 

PI Tang YT, Liu C, Zhou P, Asundi V, Zhang J, Zhao QA, Ren F; 

PI Xue AJ, Yang Y, Wehrman T, Drmanac RT; 

XX 

DR WPI; 2002-426278/45. 

DR N-PSDB; ABQ61206. 
XX 

PT New polypeptides and their encoded proteins, useful as nutritional 

PT sources or supplements, or in gene therapy, particularly for treating 

PT wounds, Alzheimer's disease, amyotrophic lateral sclerosis, cancer or 

PT inflammation. 
XX 

PS Claim 20; SEQ ID # 865; 357pp + Sequence Listing; English. 
XX 

CC The invention relates to 446 newly isolated polynucleotide sequences. The 

CC activity of polynucleotides of the invention may be described as, 

CC vulnerary, neuroprotective, immunomodulator, cytostatic and anti- 

CC inflammatory. Compositions comprising nucleic acids of the invention are 

CC useful for treating a mammalian subject, or as nutritional sources or 

CC supplements. These are useful in gene therapy, particularly for treating 

CC wounds, burns or ulcers, Alzheimer's disease, Huntington's disease, 

CC amyotrophic lateral sclerosis, autoimmune disorders, cancer or 

CC inflammation. The nucleic acids and polypeptides are also useful in 

CC diagnostic and research methods. The sequences given in records ABP43544- 

CC ABP43989 represent polypeptides encoded by polynucleotides of the 

CC invention. NOTE: The sequence data for this patent did not form part of 

CC the printed specification, but was obtained in electronic format directly 

CC from WIPO at ftp.wipo.int/pub/published_pct_sequences 

XX 

SQ Sequence 491 AA; 

Query Match 19.3%; Score 381; DB 5; Length 491; 
Best Local Similarity 26.6%; Pred. No. 5.2e-32; 

Matches 98; Conservative 57; Mismatches 113; Indels 100; Gaps 8; 

Qy 27 HGNLELVFTWSTVMMGLLMFSLGCSVEIRKLWSHIRRPWGIAVGLLCQFGLMPFTAYLL 86

II I : I : I I I I : I : : : I : I I I II : I I I 

Db 103 HGLNVFVGAALCITMLG-----LGCTVDVNHFGAHVRRP---VAALLAALPVRPPAAAGL 154



Qy 87 AISFSLKPVQA--IAVLIMGCCPGGTISNIFTFWVDGDMDL------------------- 125

I : : : I : I I I I I I : I I : : I I I I I : I 

Db 155 PAGPRLQAGRGGRRGLLLCGCCPGGNLSNLMSLLVDGDMNLRRAALLALSSDVGSAQTST 214 

Qy 126 ------------------------------------SISMTTCSTVAALGMMPLCIYLYT 149

II II ||: II :||||:::|: 
Db 215 PGLAVSPFHLYSTYKKKVSWLFDSKLVLISAHSLFCSIIMTISSTLLALVLMPLCLWIYS 274 

Qy 150 WSW---SLQQNLTIPYQNIGITLVCLTIPVAFGVYVNYRWPKQSKIILKI---------- 196

I : I : I : I : : I I I I : I I : : I : : : : I : I : 

Db 275 WAWINTPIVQ--LLPLGTVTLTLCSTLIPIGLGVFIRYKYSRVADYIVKVSLWSLLVTLV 332 

Qy 197 ------GAWGGVLLLWAVAGWLAKGSWNSDITLLTISFIFPLIGHVTGFLLALFTHQ 250

I : : I I I : I I : I I I : : I : I I I 

Db 333 VLFIMTGTMLGPELLASIPAAVYVIA--------------IFMPLAAYASGYGLATLFHL 378

Qy 251 SWQRCRTISLETGAQNIQMCITMLQLSFTAEHLVQMLSFPLAYGLFQLIDGFLIVAAYQT 310 

I I : I I II : I I : I : I : I : I : I : : I I I I I I I I : : I I : 

Db 379 PPNCKRTVCLETGSQNVQLCTAILKLAFPPQFIGSMYMFPLLYALFQSAEAGIFVLIYKM 438 

Qy 311 YKRRLKNK 318 

I : : I 

Db 439 YGSEMLHK 446 



RESULT 13 
ABU69595 

ID ABU69595 standard; protein; 490 AA. 
XX 

AC ABU69595; 
XX 

DT 05-JUN-2003 (first entry)
XX 

DE Human NF-kappaB associated polypeptide sequence #1.
XX 

KW Human; nuclear factor-kappaB; NF-kappaB; immune disorder; cancer;

KW inflammatory disorder; apoptosis; hepatic disorder; Hodgkin's lymphoma; 

KW haematopoietic tumour; hyper-IgM syndrome; viral infection; asthma; 

KW hypohidrotic ectodermal dysplasia; human immunodeficiency virus; HIV; 

KW X-linked anhidrotic ectodermal dysplasia; al incontinentia pigmenti; 

KW influenza; rheumatoid arthritis; inflammatory bowel disease; colitis; 

KW atherosclerosis; cachexia; euthyroid sick syndrome; stroke; EAE; 

KW experimental allergic encephalomyelitis; autoimmune disorder; wound; 

KW hyper immune activity; acute phase response; hypercongenital condition; 

KW birth defect; necrotic lesion; organ transplant rejection; pancreas; 

KW signal transduction; hyperproliferative disorder; diabetes mellitus;

KW vitamin B12 malabsorption; neurological disorder; Huntington's chorea; 

KW Turner's syndrome; bacterial infection; cardiovascular disorder; 

KW infertility; psoriasis; haemolytic anaemia; antiinflammatory; anti-HIV; 

KW cytostatic; hepatotropic; virucide; antirheumatic; antiarthritic; 

KW antiasthmatic; immunomodulator; antidiabetic; antiallergic;

KW neuroprotective; immunosuppressive; vulnerary; antibacterial;

KW antiinfertility; antianaemic; antipsoriatic; cerebroprotective; cardiant;

KW antiarteriosclerotic. 

XX 

OS Homo sapiens.



PN WO200286076-A2. 
XX 

PD 31-OCT-2002. 
XX 

PF 19-APR-2002; 2002WO-US012636.
XX

PR 19-APR-2001; 2001US-0284962P.

PR 26-APR-2001; 2001US-0286645P.

PR 09-JAN-2002; 2002US-0346986P.
XX 

PA (BRIM ) BRISTOL-MYERS SQUIBB CO. 
XX 

PI Carman J, Feder J, Nadler S; 
XX 

DR WPI; 2003-093119/08. 

DR N-PSDB; ACA54634. 
XX 

PT Novel NF-kappaB-associated polypeptides and polynucleotides useful for 

PT diagnosing, treating and preventing cancer, hepatic disorders, aberrant 

PT apoptosis, viral infections, autoimmune disorders, asthma and stroke. 
XX 

PS Claim 6; Page 488-489; 608pp; English. 
XX 

CC The present invention relates to the isolation of human nuclear factor- 

CC kappaB (NF-kappaB) associated polypeptides and polynucleotides. The NF- 

CC kappaB associated polypeptide and polynucleotide sequences are useful for 

CC preventing, treating or ameliorating various disorders including immune 

CC disorders, inflammatory disorders, cancers, disorders relating to 

CC aberrant apoptosis, hepatic disorders, Hodgkin's lymphomas, 

CC haematopoietic tumours, hyper-IgM syndromes, hypohidrotic ectodermal 

CC dysplasia, X-linked anhidrotic ectodermal dysplasia, immunodeficiency, al 

CC incontinentia pigmenti, viral infections (e.g. those caused by human

CC immunodeficiency virus (HIV), human T-cell lymphotropic virus (HTLV),

CC hepatitis B, hepatitis C, Epstein Barr virus (EBV), influenza),

CC rheumatoid arthritis, inflammatory bowel disease, colitis, asthma, 

CC atherosclerosis, cachexia, euthyroid sick syndrome, stroke, experimental 

CC allergic encephalomyelitis (EAE), autoimmune disorders, disorders related

CC to hyper immune activity, disorders related to aberrant acute phase 

CC responses, hypercongenital conditions, birth defects, necrotic lesions, 

CC wounds, organ transplant rejection, disorders related to aberrant signal 

CC transduction, hyperproliferative disorders, diseases of the pancreas

CC (e.g. diabetes mellitus, vitamin B12 malabsorption), neurological 

CC disorders (e.g. Huntington's chorea), Turner's syndrome, bacterial 

CC infections, cardiovascular disorders, infertility, psoriasis and 

CC haemolytic anaemia. The present sequence represents a human NF-kappaB 

CC associated polypeptide of the invention 

XX 

SQ Sequence 490 AA; 

Query Match 19.2%; Score 380.5; DB 6; Length 490; 
Best Local Similarity 26.8%; Pred. No. 5.8e-32; 

Matches 94; Conservative 56; Mismatches 106; Indels 95; Gaps 7;

Qy 44 LLMFSLGCSVEIRKLWSHIRRPWGIAVGLLCQFGLMPFTAYLLAISFSLKPVQA--IAVL 101

: I I I I : I : : : I : I I I II : I I I I : : : I 

Db 114 ITMLGLGCTVDVNHFGAHVRRP---VAALLAALPVRPPAAAGLPAGPRLQAGRGGRRGLL 170



Qy 102 IMGCCPGGTISNIFTFWVDGDMDL------------------------------------ 125

: I I I I I I : II : : I I I I I : I 

Db 171 LCGCCPGGNLSNLMSLLVDGDMNLRRAALLALSSDVGSAQTSTPGLAVSPFHLYSTYKKK 230 

Qy 126 -------------------SISMTTCSTVAALGMMPLCIYLYTWSW---SLQQNLTIPYQ 163

I I I I I I : I I : I I I I : : : I : I : I : I : I 

Db 231 VSWLFDSKLVLISAHSLFCSIIMTISSTLLALVLMPLCLWIYSWAWINTPIVQ--LLPLG 288

Qy 164 NIGITLVCLTIPVAFGVYVNYRWPKQSKIILKI----------------GAWGGVLLLV 207

: : II | | : | | : : | : : : : | : | : I : : I I I 

Db 289 TWLTLCSTLIPIGLGVFIRYKYSRVADYIVKVSLWSLLVTLWLFIMTGTMLGPELLAS 348 

Qy 208 VAVAGWLAKGSWNSDITLLTISFIFPLIGHVTGFLLALFTHQSWQRCRTISLETGAQNI 267 

: I I : I | | : : | : | | | I I : I I I I : I I : 

Db 349 IPAAVYVIA--------------IFMPLAAYASGYGLATLFHLPPNCKRTVCLETGSQNV 394

Qy 268 QMCITMLQLSFTAEHLVQMLSFPLAYGLFQLIDGFLIVAAYQTYKRRLKNK 318

I : I : I : I : I : : I I I I I I I I : : I I : I : : I 

Db 395 QLCTAILKLAFPPQFIGSMYMFPLLYALFQSAEAGIFVLIYKMYGSEMLHK 445



RESULT 14 
ABU69621 

ID ABU69621 standard; protein; 490 AA. 
XX 

AC ABU69621; 
XX 

DT 05-JUN-2003 (first entry) 
XX 

DE Human NF-kappaB associated polypeptide sequence #24. 
XX 

KW Human; nuclear factor-kappaB; NF-kappaB; immune disorder; cancer;

KW inflammatory disorder; apoptosis; hepatic disorder; Hodgkin's lymphoma; 

KW haematopoietic tumour; hyper-IgM syndrome; viral infection; asthma; 

KW hypohidrotic ectodermal dysplasia; human immunodeficiency virus; HIV; 

KW X-linked anhidrotic ectodermal dysplasia; al incontinentia pigmenti; 

KW influenza; rheumatoid arthritis; inflammatory bowel disease; colitis; 

KW atherosclerosis; cachexia; euthyroid sick syndrome; stroke; EAE; 

KW experimental allergic encephalomyelitis; autoimmune disorder; wound; 

KW hyper immune activity; acute phase response; hypercongenital condition; 

KW birth defect; necrotic lesion; organ transplant rejection; pancreas; 

KW signal transduction; hyperproliferative disorder; diabetes mellitus;

KW vitamin B12 malabsorption; neurological disorder; Huntington's chorea;

KW Turner's syndrome; bacterial infection; cardiovascular disorder; 

KW infertility; psoriasis; haemolytic anaemia; antiinflammatory; anti-HIV; 

KW cytostatic; hepatotropic; virucide; antirheumatic; antiarthritic; 

KW antiasthmatic; immunomodulator; antidiabetic; antiallergic;

KW neuroprotective; immunosuppressive; vulnerary; antibacterial;

KW antiinfertility; antianaemic; antipsoriatic; cerebroprotective; cardiant;

KW antiarteriosclerotic.

XX 

OS Homo sapiens. 
XX 

PN WO200286076-A2. 
XX 

PD 31-OCT-2002. 



XX 

PF 19-APR-2002; 2002WO-US012636.
XX

PR 19-APR-2001; 2001US-0284962P.

PR 26-APR-2001; 2001US-0286645P.

PR 09-JAN-2002; 2002US-0346986P.
XX 

PA (BRIM ) BRISTOL-MYERS SQUIBB CO. 
XX 

PI Carman J, Feder J, Nadler S; 
XX 

DR WPI; 2003-093119/08. 

DR N-PSDB; ACA54634. 
XX 

PT Novel NF-kappaB-associated polypeptides and polynucleotides useful for 

PT diagnosing, treating and preventing cancer, hepatic disorders, aberrant 

PT apoptosis, viral infections, autoimmune disorders, asthma and stroke. 
XX 

PS Claim 4; Page 489-490; 608pp; English.
XX 

CC The present invention relates to the isolation of human nuclear factor- 

CC kappaB (NF-kappaB) associated polypeptides and polynucleotides. The NF- 

CC kappaB associated polypeptide and polynucleotide sequences are useful for 

CC preventing, treating or ameliorating various disorders including immune 

CC disorders, inflammatory disorders, cancers, disorders relating to 

CC aberrant apoptosis, hepatic disorders, Hodgkin's lymphomas, 

CC haematopoietic tumours, hyper-IgM syndromes, hypohidrotic ectodermal 

CC dysplasia, X-linked anhidrotic ectodermal dysplasia, immunodeficiency, al 

CC incontinentia pigmenti, viral infections (e.g. those caused by human 

CC immunodeficiency virus (HIV), human T-cell lymphotropic virus (HTLV),

CC hepatitis B, hepatitis C, Epstein Barr virus (EBV), influenza),

CC rheumatoid arthritis, inflammatory bowel disease, colitis, asthma, 

CC atherosclerosis, cachexia, euthyroid sick syndrome, stroke, experimental 

CC allergic encephalomyelitis (EAE), autoimmune disorders, disorders related

CC to hyper immune activity, disorders related to aberrant acute phase 

CC responses, hypercongenital conditions, birth defects, necrotic lesions, 

CC wounds, organ transplant rejection, disorders related to aberrant signal 

CC transduction, hyperproliferative disorders, diseases of the pancreas

CC (e.g. diabetes mellitus, vitamin B12 malabsorption), neurological 

CC disorders (e.g. Huntington's chorea), Turner's syndrome, bacterial 

CC infections, cardiovascular disorders, infertility, psoriasis and 

CC haemolytic anaemia. The present sequence represents a human NF-kappaB 

CC associated polypeptide of the invention 

XX 

SQ Sequence 490 AA;



Query Match 19.2%; Score 380.5; DB 6; Length 490; 

Best Local Similarity 26.8%; Pred. No. 5.8e-32; 

Matches 94; Conservative 56; Mismatches 106; Indels 95; Gaps 7; 

Qy 44 LLMFSLGCSVEIRKLWSHIRRPWGIAVGLLCQFGLMPFTAYLLAISFSLKPVQA--IAVL 101

: I I I I : I : : : I : I I I II : I I I I : : : I 

Db 114 ITMLGLGCTVDVNHFGAHVRRP---VAALLAALPVRPPAAAGLPAGPRLQAGRGGRRGLL 170

Qy 102 IMGCCPGGTISNIFTFWVDGDMDL------------------------------------ 125

: I I I I I I : I I : : I I I I I : I 

Db 171 LCGCCPGGNLSNLMSLLVDGDMNLRRAALLALSSDVGSAQTSTPGLAVSPFHLYSTYKKK 230



Qy 126 -------------------SISMTTCSTVAALGMMPLCIYLYTWSW---SLQQNLTIPYQ 163

I I I I I I : II : I I I I : : : I : I : I : I : I 

Db 231 VSWLFDSKLVLISAHSLFCSIIMTISSTLLALVLMPLCLWIYSWAWINTPIVQ--LLPLG 288

Qy 164 NIGITLVCLTIPVAFGVYVNYRWPKQSKIILKI----------------GAWGGVLLLV 207

: : I I II: I I : : I : : : : I : I : I : : I I I 

Db 289 TVTLTLCSTLIPIGLGVFIRYKYSRVADYIVKVSLWSLLVTLWLFIMTGTMLGPELLAS 348 

Qy 208 VAVAGWLAKGSWNSDITLLTISFIFPLIGHVTGFLLALFTHQSWQRCRTISLETGAQNI 267 

: I I : I I I : : I : I I I I I : I I I I : I I : 

Db 349 IPAAVYVIA--------------IFMPLAAYASGYGLATLFHLPPNCKRTVCLETGSQNV 394

Qy 268 QMCITMLQLSFTAEHLVQMLSFPLAYGLFQLIDGFLIVAAYQTYKRRLKNK 318

1:1 : I : I : I : : I I I I I I I I : : I I : I : : I 

Db 395 QLCTAILKLAFPPQFIGSMYMFPLLYALFQSAEAGIFVLIYKMYGSEMLHK 445 



RESULT 15 
AAE21252 



ID AAE21252 standard; protein; 225 AA. 
XX 

AC AAE21252; 
XX 

DT 01-JUL-2002 (first entry) 
XX 

DE Human gene 8 encoded secreted protein fragment, SEQ ID NO: 117. 
XX 

KW Human; secreted protein; immune disorder; antiallergic; antirheumatic; 

KW rheumatoid arthritis; breast neoplasia; breast cancer; antiarthritic; 

KW neurological disease; Alzheimer's disease; Parkinson's disease; trauma; 

KW Tourette syndrome; encephalitis; cytostatic; haemostatic; anaemia; mania;

KW antiinflammatory; ophthalmological; dermatological; immunostimulatory;

KW immunomodulatory; immunosuppressive; antibacterial; antipsoriatic; 

KW gene therapy; autoimmune disease; Huntington's disease; meningitis; 

KW demyelinating disease; peripheral neuropathy; congenital malformation; 

KW spinal cord injury; peripheral neuropathy; ischaemia; perception; 

KW multiple sclerosis; infarction; haemorrhage; schizophrenia; dementia; 

KW depression; panic disorder; learning disability; ALS; feeding disorder; 

KW hyperproliferative disorder; sleep pattern; cardiovascular disorder;

KW reproductive disorder; digestive system disorder; behavioural disorder. 
XX 

OS Homo sapiens.
XX 

FH Key Location/Qualifiers 

FT Misc-difference 200

FT /label= Unknown

FT /note= "Xaa equals any of the naturally occurring L-amino

FT acids"

FT Misc-difference 204

FT /label= Unknown

FT /note= "Xaa equals any of the naturally occurring L-amino

FT acids"

FT Misc-difference 206

FT /label= Unknown

FT /note= "Xaa equals any of the naturally occurring L-amino

FT acids"

FT Misc-difference 210

FT /label= Unknown

FT /note= "Xaa equals any of the naturally occurring L-amino

FT acids"

FT Misc-difference 214

FT /label= Unknown

FT /note= "Xaa equals any of the naturally occurring L-amino

FT acids"

FT Misc-difference 217

FT /label= Unknown

FT /note= "Xaa equals any of the naturally occurring L-amino

FT acids"

FT Misc-difference 218

FT /label= Unknown

FT /note= "Xaa equals any of the naturally occurring L-amino

FT acids"

FT Misc-difference 222

FT /label= Unknown

FT /note= "Xaa equals any of the naturally occurring L-amino

FT acids"

XX 

PN WO200216390-A1. 
XX 

PD 28-FEB-2002. 
XX 

PF 17-JAN-2001; 2001WO-US001435.
XX

PR 18-AUG-2000; 2000US-0226282P.
XX 

PA (HUMA-) HUMAN GENOME SCI INC. 
XX 

PI Rosen CA, Komatsoulis GA, Baker KP, Birse CE, Soppet DR; 

PI Olsen HS, Moore PA, Wei P, Ebner R, Duan DR, Shi Y, Choi GH; 

PI Fiscella M, Ni J; 

XX 

DR WPI; 2002-304113/34. 
XX 

PT An isolated nucleic acid molecule (I) comprising a polynucleotide which 

PT encodes a polypeptide useful in the diagnosis and treatment of disorders 

PT e.g. immune disorders. 
XX 

PS Disclosure; Page 26; 504pp; English. 
XX 

CC AAD33692-AAD33736 represent cDNAs corresponding to 21 human secreted 

CC protein genes, and AAE21191-AAE21235 represent the proteins they encode. 

CC AAE21236-AAE21280 represent human secreted protein fragments. The genes 

CC and their corresponding secreted proteins are useful for preventing, 

CC treating or ameliorating medical conditions, e.g., by protein or gene 

CC therapy. Pathological conditions can be diagnosed by determining the 

CC amount of the new protein in a sample or by determining the presence of 

CC mutations in the new genes. Specific uses are described for each of the 

CC 21 genes, based on the tissues in which they are most highly expressed, 

CC and include developing products for the diagnosis or treatment of immune 

CC or autoimmune diseases e.g. AIDS (acquired immune deficiency syndrome), 

CC asthma, anaemia and rheumatoid arthritis, breast neoplasia and breast 

CC cancer, neurological diseases e.g. Alzheimer's disease, Parkinson's 

CC disease, Huntington's disease, Tourette syndrome, meningitis, 



CC demyelinating disease, peripheral neuropathies, neoplasia, trauma,

CC congenital malformations, spinal cord injuries, toxic neuropathies 

CC induced by neurotoxins, peripheral neuropathies, multiple sclerosis, 

CC ischaemia and infarction, haemorrhages, schizophrenia, mania, dementia, 

CC depression, panic disorder, learning disabilities, ALS, altered 

CC behaviours e.g. disorders in feeding, sleep patterns, balance and 

CC perception, encephalitis, disorders in cardiovascular, neural/sensory,

CC reproductive and digestive systems, behavioural disorders and 

CC hyperproliferative disorder. The present sequence represents human

CC secreted protein fragment referred to in the disclosure of the invention 

XX 

SQ Sequence 225 AA; 

Query Match 18.4%; Score 363.5; DB 5; Length 225; 

Best Local Similarity 35.8%; Pred. No. 1.4e-30; 

Matches 76; Conservative 42; Mismatches 59; Indels 35; Gaps 4; 
Qy 86 LAISFSLKPVQAIAVLIMGCCPGGTISNIFTFWVDGDMDLSISMTTCSTVAALGMMPLCI 145

I I • • I i i i • i i i • i I i i i i -ii- ■ iii ji — iii ii i i • i i •

Db 1 IALAFKLDEVAAVAVLLCGCCPGGNLSNLMSLLVDGDMNLSIIMTISSTLLALVLMPLCL 60

Qy 146 YLYTWSW S LQQNLT I PYQNI GITLVCLT I PVAFGVYVN YRWPKQS KI I LKI 196

. . i . i . i i -i • • i i i i • i i • • i • - • i • i •

Db 61 WIYSWAWINTPIVQ — LLPLGTVTLTLCSTLIPIGLGVFIRYKYSRVADYIVKVSLWSLL 118

Qy 197 GAWGGVLL L VVAVAG VVLAKGS WN SDITLLTISFIFPLI GH VT GFL LAL 246

I : : I II : I I : I I I I : : I : I I

Db 119 VTLWLFIMTGTMLGPELLAS I PAAVYVI A 1 FMPLAGYASGYGLAT 164

Qy 247 FTHQSWQRCRTISLETGAQNIQMCITMLQLSF 278

I I I : I I I I : I I : I : I : I : I : I

Db 165 LFHLPPNCKRTVCLETGSQNVQLCTAILKLAF 196



Search completed: March 23, 2004, 14:35:49 
Job time : 62 secs



GenCore version 5.1.6 
Copyright (c) 1993 - 2004 Compugen Ltd. 



OM protein - protein search, using sw model 
Run on: March 23, 2004, 14:34:43 ; Search time 23 Seconds
(without alignments)
846.217 Million cell updates/sec

Title: US-10-091-628-2
Perfect score: 1979
Sequence: 1 MRANCSSSSACPANSSEEEL ... PGPMDCHRALEPVGHITSCE 377



Scoring table: BLOSUM62 

Gapop 10.0 , Gapext 0.5 

Searched: 389414 seqs, 51625971 residues 
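
As a rough consistency check on the run header, the reported throughput appears to equal query length times database residues divided by search time. The sketch below recomputes it from figures printed in this report (a 377-residue query, 51625971 residues searched, 23 seconds). The function and variable names are illustrative, and the formula is an inference from these numbers, not a documented GenCore definition.

```python
# Figures copied from the run header of this report.
QUERY_LENGTH = 377          # residues in query US-10-091-628-2
DB_RESIDUES = 51_625_971    # "Searched: 389414 seqs, 51625971 residues"
SEARCH_SECONDS = 23         # "Search time 23 Seconds (without alignments)"


def cell_updates_per_second(query_len: int, db_residues: int, seconds: float) -> float:
    """Dynamic-programming cells filled per second (assumed definition).

    A Smith-Waterman search fills one matrix cell per (query residue,
    database residue) pair, so the total work is their product.
    """
    return query_len * db_residues / seconds


rate = cell_updates_per_second(QUERY_LENGTH, DB_RESIDUES, SEARCH_SECONDS)
print(f"{rate / 1e6:.3f} Million cell updates/sec")
```

Under these assumptions the recomputed value agrees with the 846.217 Million cell updates/sec printed in the header.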

Total number of hits satisfying chosen parameters: 389414

Minimum DB seq length: 0

Maximum DB seq length: 2000000000

Post-processing: Minimum Match 0%
Maximum Match 100%
Listing first 45 summaries



Database : Issued_Patents_AA:*

1: /cgn2_6/ptodata/2/iaa/5A_COMB.pep:*

2: /cgn2_6/ptodata/2/iaa/5B_COMB.pep:*

4: /cgn2_6/ptodata/2/iaa/6B_COMB.pep:*

5: /cgn2_6/ptodata/2/iaa/PCTUS_COMB.pep:*

6: /cgn2_6/ptodata/2/iaa/backfiles1.pep:*

Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution.

ALIGNMENTS



RESULT 1 

US-08-176-126B-2 

; Sequence 2, Application US/08176126B 

; Patent No. 5589358 

; GENERAL INFORMATION: 

APPLICANT: Dawson, Paul A. 

TITLE OF INVENTION: ILEAL BILE ACID TRANSPORTER COMPOSITIONS AND 
; TITLE OF INVENTION: METHODS 
; NUMBER OF SEQUENCES: 5 
CORRESPONDENCE ADDRESS: 

ADDRESSEE: Arnold, White & Durkee 

STREET: P.O. Box 4433 
; CITY: Houston 

STATE: Texas 

COUNTRY: US

ZIP: 77210 
; COMPUTER READABLE FORM: 

; MEDIUM TYPE: Floppy disk 



COMPUTER: IBM PC compatible 
OPERATING SYSTEM: PC-DOS/MS-DOS/ASCII 
SOFTWARE: PatentIn Release #1.0, Version #1.25
CURRENT APPLICATION DATA:

APPLICATION NUMBER: US/08/176,126B
FILING DATE: 29-DEC-1993
CLASSIFICATION: 435
ATTORNEY/AGENT INFORMATION:
NAME: Parker, David L.
REGISTRATION NUMBER: 32,165
REFERENCE/DOCKET NUMBER: WAKE:002/PAR
TELECOMMUNICATION INFORMATION: 
TELEPHONE: (512) 418-3000 
TELEFAX: (512) 474-7577 
TELEX: na 
INFORMATION FOR SEQ ID NO: 2: 
SEQUENCE CHARACTERISTICS: 
LENGTH: 348 amino acids 
TYPE: amino acid 
TOPOLOGY: linear 
MOLECULE TYPE: protein 
US-08-176-126B-2 

Query Match 44.7%; Score 884; DB 1; Length 348; 

Best Local Similarity 46.9%; Pred. No. 3.5e-82; 

Matches 164; Conservative 74; Mismatches 102; Indels 10; Gaps 4; 
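
The summary statistics printed for each hit can be cross-checked against one another. In this report, "Query Match" is consistent with Score divided by the perfect score (1979), and "Best Local Similarity" is consistent with Matches divided by the number of aligned columns (Matches + Conservative + Mismatches + Indels). The sketch below parses one statistics block and recomputes both percentages. The helper names are hypothetical, and both formulas are inferred from the printed numbers, not taken from GenCore documentation.

```python
PERFECT_SCORE = 1979  # "Perfect score" from the run header of this report


def parse_stats(block: str) -> dict:
    """Turn 'Name value;' fields from a hit summary into a name -> float dict."""
    stats = {}
    for field in block.replace("\n", " ").split(";"):
        field = field.strip()
        if not field:
            continue
        # The value is the last whitespace-separated token; the rest is the name.
        name, value = field.rsplit(None, 1)
        stats[name] = float(value.rstrip("%"))
    return stats


def query_match_pct(score: float, perfect_score: float) -> float:
    """Score as a percentage of the query's self-alignment score (assumed)."""
    return 100.0 * score / perfect_score


def best_local_similarity_pct(stats: dict) -> float:
    """Identical positions as a percentage of aligned columns (assumed)."""
    columns = (stats["Matches"] + stats["Conservative"]
               + stats["Mismatches"] + stats["Indels"])
    return 100.0 * stats["Matches"] / columns


result_1 = parse_stats("""
Query Match 44.7%; Score 884; DB 1; Length 348;
Best Local Similarity 46.9%; Pred. No. 3.5e-82;
Matches 164; Conservative 74; Mismatches 102; Indels 10; Gaps 4;
""")

print(f"Query Match: {query_match_pct(result_1['Score'], PERFECT_SCORE):.1f}%")
print(f"Best Local Similarity: {best_local_similarity_pct(result_1):.1f}%")
```

The same relations hold for the other hits checked, for example 860.5 / 1979 = 43.5% and 160 / 351 = 45.6% for the SEQ ID NO: 4 entries.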

Qy 7 SSSACPANSS — EEELPVGLEVHGN— LELVFTWSTVMMGLLMFSLGCSVEIRKLWSHI 62 

: I I I I : : j : : I : I I : I : I I : : : I : I I I : I 1 : I I : I I : 
Db 3 NSSICNPNATICEGDSCIAPESNFNAILSVVMSTVLTILLALVMFSMGCNVELHKFLGHL 62 

Qy 63 RRPWGIAVGLLCQFGLMPFTAYLLAISFSLKPVQAIAVLIMGCCPGGTISNIFTFWVDGD 122 

I I I I I I I I I I I I I : I I I :: I ::: I : I I I I : I I I I I I I I I I I I I MINI 
Db 63 RRPWGIWGFLCQFGIMPLTGFVLSVAFGILPVQAVWLIQGCCPGGTASNILAYWVDGD 122 

Qy 123 MDLSISMTTCSTVAALGMMPLCIYLYTWSWSLQQNLTIPYQNIGITLVCLTIPVAFGVYV 182 

I I I I : I I I I I I I : I I I I I I I I ::: I I I : I I I : I I : I I I I I I : I : I I 

Db 123 MDLSVSMTTCSTLLALGMMPLCLFIYTKMWVDSGTIVIPYDSIGTSLVALVIPVSIGMYV 182

Qy 183 NYRWPKQSKIILKIGAVVGGVLLLVVAVAGVVLAKGSWNSDITLLTISFIFPLIGHVTGF 242

I : : I I : : : I I I I I I I : : I : I : : : : I I I : I : : I : I I I : I : I : II 
Db 183 NHKWPQKAKIILKIGSIAGAILIVLIAWGGILYQSAWTIEPKLWIIGTIYPIAGYGLGF 242 

Qy 243 LLALFTHQSWQRCRTISLETGAQNIQMCITMLQLSFTAEHLVQMLSFPLAYGLFQLIDGF 302 

II I I II I I : : I I I I I I I : I I : : I I I I : I I : : I I I I : I I : 

Db 243 FLARIAGQPWYRCRTVALETGLQNTQLCSTIVQLSFSPEDLNLVFTFPLIYSIFQIAFAA 302 

Qy 303 LIVAAYQTYKRRLKNKHGKKNSGCTEVCHTRKS — TSSRETNAFLEVNEE 350 

: : : I I I I : MM: I : I : I I I : : I : 

Db 303 ILLGAYVAYKK CHGKNNTELQEKTDNEMEPRSSFQETNKGFQPDEK 348



RESULT 2 
US-08-669-435-2 

; Sequence 2, Application US/08669435 
; Patent No. 5869265 
; GENERAL INFORMATION: 



APPLICANT: Dawson, Paul A. 

TITLE OF INVENTION: ILEAL BILE ACID TRANSPORTER COMPOSITIONS AND 
; TITLE OF INVENTION: METHODS 

; NUMBER OF SEQUENCES: 5 
CORRESPONDENCE ADDRESS: 

ADDRESSEE: Arnold, White & Durkee 
; STREET: P.O. Box 4433 

CITY: Houston 
STATE: Texas 
COUNTRY: US 
ZIP: 77210 
COMPUTER READABLE FORM: 

MEDIUM TYPE: Floppy disk 
COMPUTER: IBM PC compatible 
OPERATING SYSTEM: PC-DOS/MS-DOS/ASCII 
SOFTWARE: PatentIn Release #1.0, Version #1.25
CURRENT APPLICATION DATA:
; APPLICATION NUMBER: US/08/669,435

FILING DATE: 26-JUN-1996 
; CLASSIFICATION: 
; PRIOR APPLICATION DATA: 

; APPLICATION NUMBER: US 08/176,126 

FILING DATE: 29-DEC-1993 
CLASSIFICATION: 
ATTORNEY/AGENT INFORMATION: 
; NAME: Parker, David L. 

REGISTRATION NUMBER: 32,165 
REFERENCE/DOCKET NUMBER: WAKE:002/PAR 
TELECOMMUNICATION INFORMATION: 
TELEPHONE: (512) 418-3000 
TELEFAX: (512) 474-7577 
TELEX: na 
; INFORMATION FOR SEQ ID NO: 2: 

SEQUENCE CHARACTERISTICS: 
; LENGTH: 348 amino acids 

; TYPE: amino acid 

TOPOLOGY: linear 
MOLECULE TYPE: protein 
US-08-669-435-2 

Query Match 44.7%; Score 884; DB 2; Length 348; 

Best Local Similarity 46.9%; Pred. No. 3.5e-82; 

Matches 164; Conservative 74; Mismatches 102; Indels 10; Gaps 4; 



Qy 7 SSSACPANSS — EEELPVGLEVHGN — LELVFTWSTVMMGLLMFSLGCSVEIRKLWSHI 62 

: I I I I : : I ■ : : I : I I : I : I I : : : I : I I I : I I : I I : I I s 
Db 3 NSSICNPNATICEGDSCIAPESNFNAILSVVMSTVLTILLALVMFSMGCNVELHKFLGHL 62 

Qy 63 RRPWGIAVGLLCQFGLMPFTAYLLAISFSLKPVQAIAVLIMGCCPGGTISNIFTFWVDGD 122 

I I I I I I II I I I I I : I I I : : I : : : I : MM: III I I I I I I I III : I I I I I 
Db 63 RRPWGIWGFLCQFGIMPLTGFVLSVAFGILPVQAVWLIQGCCPGGTASNILAYWVDGD 122 

Qy 123 MDLSISMTTCSTVAALGMMPLCIYLYTWSWSLQQNLTIPYQNIGITLVCLTIPVAFGVYV 182 

I I I I : I I I I I I I : I I I I I I I I ::: I I I : I I I : I I : I I I I I I : I : I I 

Db 123 MDLSVSMTTCSTLLALGMMPLCLFIYTKMWVDSGTIVIPYDSIGTSLVALVIPVSIGMYV 182

Qy 183 NYRWPKQSKIILKIGAVVGGVLLLVVAVAGVVLAKGSWNSDITLLTISFIFPLIGHVTGF 242



I : : I I : : : I I I I I I I : : I : I : : : : I I I : I : : I : I I I : I : I : II 

Db 183 NHKWPQKAKIILKIGSIAGAILIVLIAWGGILYQSAWTIEPKLWIIGTIYPIAGYGLGF 242 

Qy 243 LLALFTHQSWQRCRTISLETGAQNIQMCITMLQLSFTAEHLVQMLSFPLAYGLFQLIDGF 302 

II I I I I i I : : I I I I I I I : I I : : I I I I : I I : : I I I I : I I : 

Db 243 FLARIAGQPWYRCRTVALETGLQNTQLCSTIVQLSFSPEDLNLVFTFPLIYSIFQIAFAA 302 

Qy 303 LIVAAYQTYKRRLKNKHGKKNSGCTEVCHTRKS — TSSRETNAFLEVNEE 350 

: : : I I I I : III I : I : I : I I I : : I : 

Db 303 ILLGAYVAYKK CHGKNNTELQEKTDNEMEPRSSFQETNKGFQPDEK 348



RESULT 3 

PCT-US94-14431A-2 

; Sequence 2, Application PC/TUS9414431A 
; GENERAL INFORMATION: 
; APPLICANT: 

TITLE OF INVENTION: ILEAL BILE ACID TRANSPORTER COMPOSITIONS 
NUMBER OF SEQUENCES: 11 
CORRESPONDENCE ADDRESS: 

ADDRESSEE: Arnold, White & Durkee 

STREET: P. O. Box 4433 

CITY: Houston 

STATE: Texas 

COUNTRY: United States of America 
; ZIP: 77210 

COMPUTER READABLE FORM: 
; MEDIUM TYPE: Floppy disk 

; COMPUTER: IBM PC compatible 

; OPERATING SYSTEM: PC-DOS/MS-DOS/ASCII

SOFTWARE: PatentIn Release #1.0, Version

SOFTWARE: #1.25
CURRENT APPLICATION DATA:

APPLICATION NUMBER: PCT/US94/14431A

FILING DATE: 29-DEC-1994 
; CLASSIFICATION: 
; PRIOR APPLICATION DATA: 

APPLICATION NUMBER: USSN 08/176,126 

FILING DATE: 29-DEC-1993 
; CLASSIFICATION: 

ATTORNEY/AGENT INFORMATION: 

NAME: PARKER, DAVID L. 

REGISTRATION NUMBER: 32,165 

REFERENCE/DOCKET NUMBER: WAKE005P —
TELECOMMUNICATION INFORMATION: 

TELEPHONE: (512) 418-3000 

TELEFAX: (713) 789-2679 

TELEX: 79-0924(1) GENERAL INFORMATION: 
; INFORMATION FOR SEQ ID NO: 2: 

SEQUENCE CHARACTERISTICS: 
; LENGTH: 348 amino acids

TYPE: amino acid 
TOPOLOGY: linear 
MOLECULE TYPE: protein 
PCT-US94-14431A-2 



Query Match 44.7%; Score 884; DB 5; Length 348;

Best Local Similarity 46.9%; Pred. No. 3.5e-82;

Matches 164; Conservative 74; Mismatches 102; Indels 10; Gaps 4;



Qy 7 SSSACPANSS — EEELPVGLEVHGN — LELVFTWSTVMMGLLMFSLGCSVEIRKLWSHI 62

: I I I | : : I : : I : I I : I : I I : : : I : I I I : I I : I I : I I : 
Db 3 NSSICNPNATICEGDSCIAPESNFNAILSWMSTVLTILLALVMFSMGCNVELHKFLGHL 62 

Qy 63 RRPWGIAVGLLCQFGLMPFTAYLLAISFSLKPVQAIAVLIMGCCPGGTISNIFTFWVDGD 122 

I I I I I I I I : I I I : : I : : : I : I I I I : I I I I I I I I : I I I I I 

Db 63 RRPWGIWGFLCQFGIMPLTGFVLSVAFGILPVQAVWLIQGCCPGGTASNIIAYWVDGD 122 

Qy 123 MDLSISMTTCSTVAALGMMPLCIYLYTWSWSLQQNLTIPYQNIGITLVCLTIPVAFGVYV 182 

I I I I : I I I I I I I : I I I I I I I I ::: I I I : I I I : I I : I I I 1 I I : I : I I 

Db 123 MDLSVSMTTCSTLLALGMMPLCLFIYTKMWVDSGTIVIPYDSIGTSLVALVIPVSIGMYV 182 

Qy 183 NYRWPKQSKIILKIGAVVGGVLLLVVAVAGVVLAKGSWNSDITLLTISFIFPLIGHVTGF 242

I : : I I : : : I I I I I I I : : I : I : : : : I I I : I : : I : I 1 I : I : I : II 
Db 183 NHKWPQKAKIILKIGSIAGAILIVLIAWGGILYQSAWTIEPKLWIIGTIYPIAGYGLGF 242 

Qy 243 LLALFTHQSWQRCRTISLETGAQNIQMCITMLQLSFTAEHLVQMLSFPLAYGLFQLIDGF 302 

II I I I I I I :: I I I I I I I : I I :: I I I I : I I : : I I I I : I I : 

Db 243 FLARIAGQPWYRCRTVALETGLQNTQLCSTIVQLSFSPEDLNLVFTFPLIYSIFQIAFAA 302 

Qy 303 LIVAAYQTYKRRLKNKHGKKNSGCTEVCHTRKS — TSSRETNAFLEVNEE 350 

: : : I I I I : lllh I : I : I I I : : I : 

Db 303 ILLGAYVAYKK CHGKNNTELQEKTDNEMEPRSSFQETNKGFQPDEK 348



RESULT 4 

US-08-176-126B-4 

; Sequence 4, Application US/08176126B 

; Patent No. 5589358 

; GENERAL INFORMATION: 

APPLICANT: Dawson, Paul A.

TITLE OF INVENTION: ILEAL BILE ACID TRANSPORTER COMPOSITIONS AND 
; TITLE OF INVENTION: METHODS 

NUMBER OF SEQUENCES: 5 
CORRESPONDENCE ADDRESS: 

ADDRESSEE: Arnold, White & Durkee 

STREET: P.O. Box 4433
; CITY: Houston 

STATE: Texas 

COUNTRY: US 

ZIP: 77210 
COMPUTER READABLE FORM: 

MEDIUM TYPE: Floppy disk 

COMPUTER: IBM PC compatible 

OPERATING SYSTEM: PC-DOS/MS-DOS/ASCII 

SOFTWARE: PatentIn Release #1.0, Version #1.25
CURRENT APPLICATION DATA:

APPLICATION NUMBER: US/08/176,126B

FILING DATE: 29-DEC-1993 

CLASSIFICATION: 435 
ATTORNEY/AGENT INFORMATION: 
; NAME: Parker, David L.

REGISTRATION NUMBER: 32,165

REFERENCE/DOCKET NUMBER: WAKE:002/PAR



TELECOMMUNICATION INFORMATION: 
TELEPHONE: (512) 418-3000 
TELEFAX: (512) 474-7577 
TELEX: na 
INFORMATION FOR SEQ ID NO: 4: 
SEQUENCE CHARACTERISTICS: 
LENGTH: 348 amino acids 
TYPE: amino acid 
TOPOLOGY: linear 
MOLECULE TYPE: protein 
US-08-176-126B-4 

Query Match 43.5%; Score 860.5; DB 1; Length 348; 

Best Local Similarity 45.6%; Pred. No. 8.8e-80; 

Matches 160; Conservative 68; Mismatches 104; Indels 19; Gaps 4; 

Qy 5 CSSSSACPANSSEEELPVGLEVHGNLELVFTWSTVMMGLLMFSLGCSVEIRKLWSHIRR 64 

| | : | I : : I : I : I I : : : I : I I I : I I : I I I s I I I : I 

Db 14 CSGASCWPESNFNNI LSWLSTVLTILLALVMFSMGCNVEIKKFLGHIKR 64 

Qy 65 PWGIAVGLLCQFGLMPFTAYLLAISFSLKPVQAIAVLIMGCCPGGTISNIFTFWVDGDMD 124

I I I I II 11111:11 I : : I : : : I : I : I I : I I I : I I I I I I I III : I 

Db 65 PWGICVGFLCQFGIMPLTGFILSVAFDILPLQAVWLIIGCCPGGTASNILAYWVDGDMD 124 

Qy 125 LSISMTTCSTVAALGMMPLCIYLYTWSWSLQQNLTIPYQNIGITLVCLTIPVAFGVYVNY 184

I I : II I I I I I : I I I I I I I I :: I I I : : I I I I I I : I I I : I I : I : : I I : 

Db 125 LSVSMTTCSTLLALGMMPLCLLIYTKMWVDSGSIVIPYDNIGTSLVALWPVSIGMFVNH 184 

Qy 185 RWPKQSKIILKIGAVVGGVLLLVVAVAGVVLAKGSWNSDITLLTISFIFPLIGHVTGFLL 244

: I I : : : I I I I I I I : : I : I : : : : I I I : I : : I I I I I I : I : MM 

Db 185 KWPQKAKI I LKI GSI AGAILIVLI AWGGI LYQSAWI I APKLWI I GTI FPVAGYSLGFLL 244 

Qy 245 ALFTHQSWQRCRTISLETGAQNIQMCITMLQLSFTAEHLVQMLSFPLAYGLFQLIDGFLI 304

I I II I I :: M I I I I : I I :: M M I I I : MM I MM : 

Db 245 ARIAGLPWYRCRTVAFETGMQNTQLCSTIVQLSFTPEELNWFTFPLIYSIFQLAFAAIF 304 

Qy 305 VAAYQTYKRRLKNKHGKKNSGCTEVCHTRKSTSSRETNAFLEVNEEGAITP 355 

: I II : III: I I : : : I : I I I 

Db 305 LGFYVAYKK CHGKNKAEIPE SKENGTEPESSFYKAN — GGFQP 345 



RESULT 5 
US-08-669-435-4 

; Sequence 4, Application US/08669435 

; Patent No. 5869265 

; GENERAL INFORMATION: 

APPLICANT: Dawson, Paul A. 

TITLE OF INVENTION: ILEAL BILE ACID TRANSPORTER COMPOSITIONS AND 

TITLE OF INVENTION: METHODS 

NUMBER OF SEQUENCES: 5 

CORRESPONDENCE ADDRESS: 
; ADDRESSEE: Arnold, White & Durkee 

; STREET: P.O. Box 4433 

CITY: Houston 
; STATE: Texas 

; COUNTRY: US 

ZIP: 77210 



; COMPUTER READABLE FORM: 

MEDIUM TYPE: Floppy disk 

COMPUTER: IBM PC compatible 

OPERATING SYSTEM: PC-DOS/MS-DOS/ASCII 

SOFTWARE: PatentIn Release #1.0, Version #1.25
CURRENT APPLICATION DATA:
; APPLICATION NUMBER: US/08/669,435

; FILING DATE: 26-JUN-1996

CLASSIFICATION: 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: US 08/176,126 

FILING DATE: 29-DEC-1993 

CLASSIFICATION: 
; ATTORNEY/AGENT INFORMATION: 

NAME: Parker, David L. 

REGISTRATION NUMBER: 32,165 

REFERENCE/DOCKET NUMBER: WAKE:002/PAR
TELECOMMUNICATION INFORMATION: 
; TELEPHONE: (512) 418-3000 

; TELEFAX: (512) 474-7577 

TELEX: na 
; INFORMATION FOR SEQ ID NO: 4: 
SEQUENCE CHARACTERISTICS: 

LENGTH: 348 amino acids 
; TYPE: amino acid 

; TOPOLOGY: linear 

MOLECULE TYPE: protein 
US-08-669-435-4 

Query Match 43.5%; Score 860.5; DB 2; Length 348; 

Best Local Similarity 45.6%; Pred. No. 8.8e-80; 

Matches 160; Conservative 68; Mismatches 104; Indels 19; Gaps 4; 

Qy 5 CSSSSACPANSSEEELPVGLEVHGNLELVFTWSTVMMGLLMFSLGCSVEIRKLWSHIRR 64 

I I : I I : : I : I : I I : : : I : I I I : I I : I I I : I I I : I 

Db 14 CSGASCWPESNFNNI LS WLSTVLT I LLALVMFSMGCNVEI KKFLGHI KR 64 

Qy 65 PWGIAVGLLCQFGLMPFTAYLLAISFSLKPVQAIAVLIMGCCPGGTISNIFTFWVDGDMD 124 

I ! I I I I I I I I I : I I I : : I : : : I : I : I I : II I : I I I I I I I I I I : I I I I I I I 

Db 65 PWGI CVGFLCQFGIMPLTGFI LSVAFDI LPLQAWVLI I GCCPGGTASNI LAYWVDGDMD 124 

Qy 125 LSISMTTCSTVAALGMMPLCIYLYTWSWSLQQNLTIPYQNIGITLVCLTIPVAFGVYVNY 184 

I I : I I I I I I I : I I I I I I I I : : I I I : : I I I I I I : I I I : I I : I : : I I : 

Db 125 LSVSMTTCSTLLALGMMPLCLLIYTKMWVDSGSIVIPYDNIGTSLVALWPVSIGMFVNH 184 

Qy 185 RWPKQSKIILKIGAVVGGVLLLVVAVAGVVLAKGSWNSDITLLTISFIFPLIGHVTGFLL 244

: I I : : : I I I I I II : : I : I : : : : I I I : I : : I I I I I I : I : I I II 

Db 185 KWPQKAKI I LKI GS I AGAI LI VLI AWGGI LYQSAWI I APKLWI I GTI FPVAGYS LGFLL 244 

Qy 245 ALFTHQSWQRCRTISLETGAQNIQMCITMLQLSFTAEHLVQMLSFPLAYGLFQLIDGFLI 304 

I I I I I I : : I I I I I I : I I : : I I I I I I I : : I I I I : I I I : 

Db 245 ARI AGLPWYRCRTVAFETGMQNTQLCSTI VQLS FT PEELNWFT FPLI YS I FQLAFAAI F 304 

Qy 305 VAAYQTYKRRLKNKHGKKNSGCTEVCHTRKSTSSRETNAFLEVNEEGAITP 355

: I I I : III: I I : : : I : I I I 

Db 305 LGFYVAYKK CHGKNKAEIPE SKENGTEPESSFYKAN — GGFQP 345 



RESULT 6 

PCT-US94-14431A-4 

; Sequence 4, Application PC/TUS9414431A
; GENERAL INFORMATION: 
APPLICANT: 

TITLE OF INVENTION: ILEAL BILE ACID TRANSPORTER COMPOSITIONS 
NUMBER OF SEQUENCES: 11
; CORRESPONDENCE ADDRESS: 

; ADDRESSEE: Arnold, White & Durkee 

STREET: P. O. Box 4433 
CITY: Houston 
; STATE: Texas 

; COUNTRY: United States of America 

ZIP: 77210 
COMPUTER READABLE FORM: 
; MEDIUM TYPE: Floppy disk 

; COMPUTER: IBM PC compatible 

OPERATING SYSTEM: PC-DOS/MS-DOS/ASCII 

SOFTWARE: PatentIn Release #1.0, Version
; SOFTWARE: #1.25 

; CURRENT APPLICATION DATA: 

APPLICATION NUMBER: PCT/US94/14431A 

FILING DATE: 29-DEC-1994 

CLASSIFICATION: 
; PRIOR APPLICATION DATA: 

; APPLICATION NUMBER: USSN 08/176,126 

FILING DATE: 29-DEC-1993 

CLASSIFICATION: 
ATTORNEY/AGENT INFORMATION: 
; NAME: PARKER, DAVID L. 

; REGISTRATION NUMBER: 32,165 

REFERENCE/DOCKET NUMBER: WAKE005P — 
TELECOMMUNICATION INFORMATION: 

TELEPHONE: (512) 418-3000 

TELEFAX: (713) 789-2679 

TELEX: 79-0924(1) GENERAL INFORMATION: 
; INFORMATION FOR SEQ ID NO: 4: 

SEQUENCE CHARACTERISTICS: 
; LENGTH: 348 amino acids

; TYPE: amino acid 

TOPOLOGY: linear 
; MOLECULE TYPE: protein 
PCT-US94-14431A-4 

Query Match 43.5%; Score 860.5; DB 5; Length 348; 

Best Local Similarity 45.6%; Pred. No. 8.8e-80; 

Matches 160; Conservative 68; Mismatches 104; Indels 19; Gaps 4; 
Qy 5 CSSSSACPANSSEEELPVGLEVHGNLELVFTVVSTVMMGLLMFSLGCSVEIRKLWSHIRR 64

Db 14 CSGASCWPESNFNNI

Qy 65 PWGIAVGLLCQFGLMPFTAYLLAISFSLKPVQAIAVLIMGCCPGGTISNIFTFWVDGDMD 124

Db 65 PWGICVGFLCQFGIMPLTGFILSVAFDILPLQAVWLIIGCCPGGTASNILAYWVDGDMD 124

Qy 125 LSISMTTCSTVAALGMMPLCIYLYTWSWSLQQNLTIPYQNIGITLVCLTIPVAFGVYVNY 184

I I : I I I I I I I : I I I I I I I I : : I I I : : I I I I I I : I I I : I I : I : : I I :

Db 125 LSVSMTTCSTLLALGMMPLCLLIYTKMWVDSGSIVIPYDNIGTSLVALWPVSIGMFVNH 184

Qy 185

I : : : I I I I I I I : : I : I : : : : I I I : I : : I i I I I I : I : II

Db 185

Qy 245

I I I II :: I II I I I : I I :: I I I I I I I : : I I I I : I I I

Db 245

Qy 305

III: III: I I : : : I : I I I

Db 305 LGFYVAYKK CHGKNKAEIPE SKENGTEPESSFYKAN--GGFQP 345



RESULT 7 

US-09-328-352-5100 

; Sequence 5100, Application US/09328352 

; Patent No. 6562958 

; GENERAL INFORMATION: 

; APPLICANT: Gary L. Breton et al.

; TITLE OF INVENTION: NUCLEIC ACID AND AMINO ACID SEQUENCES RELATING TO ACINETOBACTER

; TITLE OF INVENTION: BAUMANNII FOR DIAGNOSTICS AND THERAPEUTICS 
; FILE REFERENCE: GTC99-03PA 

; CURRENT APPLICATION NUMBER: US/09/328,352
; CURRENT FILING DATE: 1999-06-04
; NUMBER OF SEQ ID NOS: 8252
; SEQ ID NO 5100
; LENGTH: 325
TYPE: PRT 

; ORGANISM: Acinetobacter baumannii 
US-09-328-352-5100 

Query Match 14.2%; Score 280.5; DB 4; Length 325; 

Best Local Similarity 25.7%; Pred. No. 1.9e-20; 

Matches 71; Conservative 70; Mismatches 116; Indels 19; Gaps 7; 

Qy 41 MMGLLMFS LGCSVEIRKLWSHI RRPWGI AVGLLCQFGLMPFTAYLLAI S FS LKPVQAI AV 100 

: : I : : I : I : : : : : I : : I : : II ' : I I I : : I I : I I I : I 

Db 56 I LG 1 1 ML GMGMTMT VD D FKGVLQ S P KAVL I GWAQ FWMP G LAF ILCKLFNLPP E I AVGV 115 

Qy 101 LIMGCCPGGTISNIFT FWVDGDMDLSISMTTCSTVAALGMMPLCIYLYTWSWSLQQNLTI 160 

: : : I I I I I I I I I : I : I : : I I : : I : I I : I : I II I II 
Db 116 I LVGCC P GGTASNVI T YMAKGNVAL S VACT S VST LLAP VLT PT I FYLLAS QW LKI 170 

Qy 161 PYQNIGIT L V- C L T I P VAFG VYVN Y RW P KQ S K 1 1 L K I GAWGG VL L L WAVAG WLAK G S 219 

: : I : : : : : | : I : : . : I : : : : : I I : : I I I : : I I 
Db 171 DAASMFISILQWLLPIVIGLILRTWLKRQVESYIQVMPLV-SVIAIVAIVAAII — GGS 227 

Qy 220 WNSDITLLTISFIFPLIGHVTGFLLALFTHQSWQRCRTISLETGAQNIQMCITMLQ 275 

I : : I : : : | : : I I II : : I : : I I I I : : : 

Db 228 KAAILQSGLLIIAWILHNGLGYLLGFAAARFFKLPYVDSKAIAVEVGMQNSGLGVALAA 287 



Qy 



276 L S FT AE H L VQML S F P LAYG LFQLIDGFLI VAAYQ T Y 311 



: I I : : I : I : I I III 

Db 288 VHFAASPITAVPS— AIFSLWHNISG PALATY 317 



RESULT 8 

US-09-252-991A-17715

Sequence 17715, Application US/09252991A 
Patent No. 6551795 
GENERAL INFORMATION: 
APPLICANT: Marc J. Rubenfield et al.

TITLE OF INVENTION: NUCLEIC ACID AND AMINO ACID SEQUENCES RELATING TO PSEUDOMONAS

TITLE OF INVENTION: AERUGINOSA FOR DIAGNOSTICS AND THERAPEUTICS 
FILE REFERENCE: 107196.136 

CURRENT APPLICATION NUMBER: US/09/252,991A
CURRENT FILING DATE: 1999-02-18 
PRIOR APPLICATION NUMBER: US 60/074,788 
PRIOR FILING DATE: 1998-02-18 
PRIOR APPLICATION NUMBER: US 60/094,190 
PRIOR FILING DATE: 1998-07-27 
NUMBER OF SEQ ID NOS: 33142
SEQ ID NO 17715 
LENGTH: 315 
TYPE: PRT 

ORGANISM: Pseudomonas aeruginosa 
US-09-252-991A-17715 

Query Match 13.7%; Score 271.5; DB 4; Length 315; 

Best Local Similarity 25.7%; Pred. No. 1.6e-19; 

Matches 75; Conservative 74; Mismatches 116; Indels 27; Gaps 8; 

Qy 32 LVFTWSTVMMGLLMFSLGCSVEIRKLWSHIRRPWGIAVGLLCQFGLMPFTAYLLAISFS 91 

I I : : I I : I I : I : : : II : : I : I I I : I I I : I I 

Db 38 LPLTAAIAPLLGLVMFGMGLTLKGEDFREVARHPIRVLIGVLAQFVIMPGLAWLLCSLLQ 97 

Qy 92 LKPVQAIAVLIMGCCPGGTISNIFTFWVDGDMDLSISMTTCSTVAALGMMPLCIYLYTWS 151 

I I : I : : : I I I I I I I I I : I : I I : I I : : : I : : I : I : I : : I : 

Db 98 L P AE I AVGVI LVGCC P GGTASN VMT WL S RGDVAL S VAI T S VTT LLAP L VT P ALVWLLAS A 157 

Qy 152 W S LQQN LT I P YQN I G I T LV- CLT I P VAFGVYVN Y RW P KQ S K 1 1 L K I GAWGGVL LL WAV 210 

| | : : : : : : : : : | : | | : ::::::: : | | : | | : 

Db 158 W LPVS FAAMFLS I LQWLVP I ALGLLAQRLLGERT RQVAEVLP LV- S VFS IWI I 211 

Qy 211 AGWLAKG S WN S D I T LLT I S FI F PLIGHVTGFLLALFTHQSWQRCRTISLETGA 264 

I I I I : : : I I : : : I : I : : I I I : I : : : : I I 

Db 212 AAWAAS Q ARI AE S G L L I MAWML HN GFGLLLGYLTGKLT GM P LAQ R KALAIEVGM 267 

Qy 2 65 QN I QMC I TMLQL S FT AEH LVQML S FP LA- YGL FQL I DG FL I VAAYQT YKRRL 315 

II : I : I : : | | : : : : | | : | : | | | 

Db 268 QN SGLGAALAimiFAPIAAVPSALFSVWHNLSGSLLAALF RRL 310 



RESULT 9 

US-09-540-236-2883 

; Sequence 2883, Application US/09540236 
; Patent No. 6673910 
; GENERAL INFORMATION: 



; APPLICANT: Gary L. Breton et al.

; TITLE OF INVENTION: NUCLEIC ACID AND AMINO ACID SEQUENCES RELATING TO MORAXELLA CATARRHALIS

TITLE OF INVENTION: FOR DIAGNOSTICS AND THERAPEUTICS
FILE REFERENCE: 2709.2005-001
CURRENT APPLICATION NUMBER: US/09/540,236
CURRENT FILING DATE: 2000-04-04
NUMBER OF SEQ ID NOS: 3840
SEQ ID NO 2883 
LENGTH: 323 
TYPE: PRT 

ORGANISM: M. catarrhalis 
US-09-540-236-2883 

Query Match 13.5%; Score 267.5; DB 4; Length 323; 

Best Local Similarity 24.2%; Pred. No. 4.1e-19; 

Matches 67; Conservative 69; Mismatches 112; Indels 29; Gaps 5; 

Qy 32 LVFTWST VMMGLLMFSLGCSVEIRKLWSHIRRPWGIAVGLLCQFG 77

: | | : : : | : : I : : I : I : : : : I : I I : : I :

Db 28 WFALIATQFPDAFKQLVPWIPYLLGIVMLGMGLTLTFKDFGEVTKNPKAVIVGVILQYV 87

Qy 78 LMPFTAYLLAISFSLKPVQAIAVLIMGCCPGGTISNIFTFWVDGDMDLSISMTTCSTVAA 137

: | | | : | I : I I I I I I : : : I I I I I I I I I : I I I : I I : : I I I I : I

Db 88 VMPVIAFLLVQAFRLPPDLAI GVI LVGCCPGGTSSNVITFLAKGNTALSVACTTLSTLLA 147

Qy 138 LGMMPLCIYLYTWSWSLQQNLTIPYQNIGITLVCLTIPVAFGVYVNYRWPKQSKIILKIG 197

: I ||: III: : :: I : : I I I : : ::::::

Db 148

Qy 198

I I : : I I : : II I

Db 204

Qy 252

I I 1 1

Db 259



RESULT 10 

US-09-252-991A-23958 

; Sequence 23958, Application US/09252991A 
; Patent No. 6551795 
; GENERAL INFORMATION: 

; APPLICANT: Marc J. Rubenfield et al. 

; TITLE OF INVENTION: NUCLEIC ACID AND AMINO ACID SEQUENCES RELATING TO PSEUDOMONAS

; TITLE OF INVENTION: AERUGINOSA FOR DIAGNOSTICS AND THERAPEUTICS 

FILE REFERENCE: 107196.136 
; CURRENT APPLICATION NUMBER: US/09/252,991A
; CURRENT FILING DATE: 1999-02-18 
; PRIOR APPLICATION NUMBER: US 60/074,788 
; PRIOR FILING DATE: 1998-02-18 
; PRIOR APPLICATION NUMBER: US 60/094,190 
; PRIOR FILING DATE: 1998-07-27 
; NUMBER OF SEQ ID NOS: 33142 
; SEQ ID NO 23958 



LENGTH: 308 
TYPE: PRT 

ORGANISM: Pseudomonas aeruginosa 
US-09-252-991A-23958 

Query Match 13.0%; Score 257.5; DB 4; Length 308; 

Best Local Similarity 26.2%; Pred. No. 4.1e-18; 

Matches 75; Conservative 60; Mismatches 128; Indels 23; Gaps 7;

Qy 33 TWSTVMMGLLMFSLGCSVEIRKLWSHIRRPWGIAVGLLCQFGLMPFTAYLLAISFSL 92

I : : : I : : I III: : I I : I I I : I I I : I : I : 1 = 1

Db 17

Qy 93

: | : : : : : | | | I : I : : : I I : I : I : : I : : I I : Ml : I

Db 77 E/yUAVGMMLLAASPGGTTANLYSHLAHGDVALNITLTAWSVIAILTMPLIWL 131

Qy 153 SLQ QN LT I P YQN I G I T LVC LT I P VAFGVYVN YRW PKQSKIILKI GAWGGVLL L 206

Ml I : : : : I : I II I : I II : : I : ' : I I I

Db 132

Qy 207 loAKGSWNSDITLLTI SFIFPLIGHVTGFLLALFTHQSWQRCRTISLET 262

I I I I : : I : : I I : I : : : I : I

Db 192

Qy 263

I I : I I I I : : I III: I I I :

Db 249



RESULT 11 

US-09-489-039A-8584 

; Sequence 8584, Application US/09489039A 
; Patent No. 6610836 
; GENERAL INFORMATION: 

APPLICANT: Gary Breton et. al 
; TITLE OF INVENTION: NUCLEIC ACID AND AMINO ACID SEQUENCES RELATING TO KLEBSIELLA

; TITLE OF INVENTION: PNEUMONIAE FOR DIAGNOSTICS AND THERAPEUTICS 

; FILE REFERENCE: 2709.2004001 

; CURRENT APPLICATION NUMBER: US/09/489,039A

; CURRENT FILING DATE: 2000-01-27 

; PRIOR APPLICATION NUMBER: US 60/117,747 

; PRIOR FILING DATE: 1999-01-29 

; NUMBER OF SEQ ID NOS: 14342

; SEQ ID NO 8584 

LENGTH: 349 
; TYPE: PRT 

; ORGANISM: Klebsiella pneumoniae 
US-09-489-039A-8584 

Query Match 12.0%; Score 237.5; DB 4; Length 349; 

Best Local Similarity 26.5%; Pred. No. 5.4e-16; 

Matches 79; Conservative 57; Mismatches 109; Indels 53; Gaps 

Qy 37 VSTVMMGLLMFSLGCSVEIRKLWSHIRRPWGIAVGLLCQFGLMPFTAYLLAISFSLKPVQ 96 

| : I : : I I : I I : I : : : : I I : I I : : : I I I : I I I : I : I 



Db 74 VTTLLM-LIMFGMGVHLKLEDFKRVLSRPAPVAAGIFLHYLVMPLAAWLLALLFHMPPEL 132



Qy 97 AIAVLIMGCCPGGTISNIFTFWVDGDMDLSISMTTCSTVAALGMMPLCIYLYTWSWSLQQ 156 

: : : : : | I I I I : I I I : I I : : : : : I I : : II II 

Db 133 SAGMVLVGSVASGTASNVMIFLAKGDVALSVTISSVSTLVGWATPLLTRLYV 185 

Qy 157 NLTIPYQNIGITLVCL TIPVAFGVYVNYRWPKQSK IILKIGAW 200 

: I : I : I I I I : I I : I : : I I I I : I I I I 

Db 186 DAHIQVDVMGMLLSILQIWIPIALGLIVHHLLPKWKAVEPFLPAFSMVCILAIISAW 245 

Qy 201 GGVLLLWAVAGWLAKGSWNSDITLLTISFIFPLIGHVTGFLLALFTHQSWQRCRTISL 260 

I : : I I I : : : I I I I III : III::: 
Db 246 AGSAAHIASVGLWIIAVILHNTIGLL GGYWGGRLFGFDEST CRTLAI 293 

Qy 261 ET GAQN I QMC I TMLQL S FT AEHL VQML S FP LA YGLFQLI DGFLI VAAYQTYK 312 

I I II : : : : I III : : : : | | : | I : I 

Db 294 EVGMQNSGLAAALGKIYFG PLAALPGALFSVWHNLSGSLL-AGYWSGK 340 



RESULT 12 

US-09-543-681A-6013 

; Sequence 6013, Application US/09543681A 

; Patent No. 6605709 

; GENERAL INFORMATION: 

; APPLICANT: GARY BRETON 

; TITLE OF INVENTION: NUCLEIC ACID AND AMINO ACID SEQUENCES RELATING TO PROTEUS MIRABILIS FOR

; TITLE OF INVENTION: DIAGNOSTICS AND THERAPEUTICS 

; FILE REFERENCE: 2709.1002-001 

; CURRENT APPLICATION NUMBER: US/09/543,681A

; CURRENT FILING DATE: 2000-04-05 

; PRIOR APPLICATION NUMBER: US 60/128,706 

; PRIOR FILING DATE: 1999-04-09 

; NUMBER OF SEQ ID NOS: 8344 

; SEQ ID NO 6013 

LENGTH: 659 
; TYPE: PRT 

ORGANISM: Proteus mirabilis 
US-09-543-681A-6013 

Query Match 6.2%; Score 122; DB 4; Length 659; 

Best Local Similarity 27.5%; Pred. No. 0.00089; 

Matches 56; Conservative 24; Mismatches 66; Indels 58; Gaps 11; 

Qy 67 GIAVG — LLCQFGLMPFTAYLLAISFSLKPVQAIAVLI MGCCP 107 

111111:11111 II II M I 

Db 45 GPLVGPVLAAQMGYLPGTLRLLAGWLAGAVQDFMVLFISTRRNGNSLGEMIKKEMGPVP 104 

Qy 108 GGTISNIFTFWVDGDMDLSISMTTCSTVAALGMMPLCIYLYTWSWSLQQNLTIPYQNIGI 167 

III: | : | : : : : I II I I I : 

Db 105 -GTIALFGCFLI MIIILAVLALIWKALAESP W GV 138 

Qy 168 TLVCLT I PVA- - FGVYVNYRWPKQS KI I LKI GAV- VGGVLLLWAV — AGWLAKGSWNS 222 

II I : I : I I : I : I I : : | I I | : : | | : I : I I : : I 

Db 139 FTVCSTVPIALFMGIYMRYIRPG RVGEVSVIGIVLLIAAIWFGGVIASDPYWGP 192 



Qy 223 DITLLTISFIFPLIGHVTGFLLAL 246



: I : MM: I : I I 
Db 193 ALTFKDTTITFGLIGY — AFISAL 214 



RESULT 13 

US-09-489-039A-8942 

; Sequence 8942, Application US/09489039A 

; Patent No. 6610836 

; GENERAL INFORMATION: 

; APPLICANT: Gary Breton et. al

; TITLE OF INVENTION: NUCLEIC ACID AND AMINO ACID SEQUENCES RELATING TO KLEBSIELLA

; TITLE OF INVENTION: PNEUMONIAE FOR DIAGNOSTICS AND THERAPEUTICS 

; FILE REFERENCE: 2709.2004001 

; CURRENT APPLICATION NUMBER: US/09/489, 039A 

; CURRENT FILING DATE: 2000-01-27 

; PRIOR APPLICATION NUMBER: US 60/117,747 

; PRIOR FILING DATE: 1999-01-29 

; NUMBER OF SEQ ID NOS: 14342

; SEQ ID NO 8942 

LENGTH: 722 
; TYPE: PRT 

; ORGANISM: Klebsiella pneumoniae 
US-09-489-039A-8942

Query Match 5.8%; Score 115; DB 4; Length 722; 

Best Local Similarity 26.5%; Pred. No. 0.0053; 

Matches 54; Conservative 26; Mismatches 66; Indels 58; Gaps 11; 

Qy 67 GIAVG — LLCQFGLMP FTAYLLAI S FS LKPVQAI AVL I MGCCP 107 

I I I I I I : I I : I I I I I I I III 

Db 108 G P LVG P VLAAQMG Y L P GT LW L LAG WLAG AVQ D FMVL F I S S RRN GAS L G EMI KQ EM G P VP 167 

Qy 108 GGTISNIFT FWVDGDMDLSISMTTCSTVAALGMMPLCIYLYTWSWSLQQNLTIPYQNIGI 167 

I : I : | : | : : : : I I I I I I : 

Db 168 -GSIALFGCFLI MIIILAVLALIWKALAESP W GV 201 

Qy 168 TLVCLTI PVA — FGVYVNYRWPKQSKI ILKIGAV-VGGVLLLVVAV — AGWLAKGSWNS 222 

II |:|:| 1:1: : I : : | | | | : : | | | : : ||: I 

Db 202 FTVCSTVPIALFMGIYMRFLRPG RVGEVSVIGIVLLVASIWFGGVIAHDPYWGP 255 

Qy 223 DITLLTISFIFPLIGHVTGFLLAL 246 

: I : I II I : I : I I 

Db 256 ALTFKDTTITFTLIGY — AFISAL 277 



RESULT 14 

US-09-328-352-4577 

; Sequence 4577, Application US/09328352 

; Patent No. 6562958 

; GENERAL INFORMATION: 

; APPLICANT: Gary L. Breton et al. 

; TITLE OF INVENTION: NUCLEIC ACID AND AMINO ACID SEQUENCES RELATING TO ACINETOBACTER

; TITLE OF INVENTION: BAUMANNII FOR DIAGNOSTICS AND THERAPEUTICS 
; FILE REFERENCE: GTC99-03PA 

; CURRENT APPLICATION NUMBER: US/09/328,352 



; CURRENT FILING DATE: 1999-06-04 
; NUMBER OF SEQ ID NOS: 8252
; SEQ ID NO 4577 

LENGTH: 324 

TYPE: PRT

; ORGANISM: Acinetobacter baumannii 
US-09-328-352-4577 



Query Match 5.3%; Score 104; DB 4; Length 324; 

Best Local Similarity 23.6%; Pred. No. 0.023; 

Matches 71; Conservative 48; Mismatches 124; Indels 58; Gaps 15; 



Qy 26 VHGNLELVFTWSTVMMGLLMFSLGCSVEIRKLWSHIRRPWGI AVGL 72

II | :: | | : : I I 1 : : 1 1 : 1 : 1 1

Db 29 VSGQAAQYFNTLTTVAIAILFFLHGAKLSREAVIEGILH-WKMHLLVFAITFFIFPAIGL 87

Qy 73 LCQFGLMPFTAYLLAISFSLKPVQAIAVLIMGCCPGGTISNI-FTFWVDGDMDLSISMTT 131

I : |:| 1 1 II 1 hi 1 1 |:: :: :

Db 88 t ak-dtt t dt t mm vtaTTV t fmp ft.p s tvo s s t AFT SVAKGNVAGAVCSAS 137

Qy 132 CSTVAALGMMPLCIYLYTWSWSLQQNLTIPYQNIGITLVCLTIPVAFGVYVN-YRWPKQS 190

i i . . ii i i i i • i • 1 1 • 1*1

| : : : | : : : I I 1 III* 1 • 1 1 • 1 • 1

Db 138 FSNLVGMFITPVLVSFFILGQS-QHGFDPTSSIIQITLL-LLVPFVLGQLLRPYVFPYMA 195

Qy 191 KI I LKI GAWGGVLLLW — AVAGWLAKGSWN SDITLLTI — SFIFPLIGHVTGF 242

| : : I 1 : 1 : II 1 : 1 1 : 1 1 1 : : 1 1 1 1 1 : : 1

Db 196 KVPSIVKAFDQGSILMWYGAFSGAWA-GLWHQVSWKTLLLLTIACSVLLTII M 249

Qy 243 LLALFTHQS — WQRCRTI SLETGAQNIQMC 1 TMLQ L S FT AEH L VQM 286

|| | | : : : : 1 1 1 : 1 1 : 1 : 1 = 1 1 : 1 '

Db 250 LLALYLPRAFGFNRADQITVFFCGSKKTLASGVPMAQILFIGQPLGMIVLPIMIFHQIQL 309

Qy 287 L 287

Db 310 M 310




RESULT 15 
US-09-976-594-489





; Sequence 489, Application US/09976594 

; Patent No. 6673549 

; GENERAL INFORMATION: 

; APPLICANT: Furness, Michael 

; APPLICANT: Buchbinder, Jenny 

; TITLE OF INVENTION: GENES EXPRESSED IN C3A LIVER CELL CULTURES TREATED WITH STEROIDS

; FILE REFERENCE: PA-0041 US

; CURRENT APPLICATION NUMBER: US/09/976,594

; CURRENT FILING DATE: 2001-10-12 

; PRIOR APPLICATION NUMBER: 60/240,409 

; PRIOR FILING DATE: 2000-10-12 

; NUMBER OF SEQ ID NOS: 1143 

; SOFTWARE: PERL Program 

; SEQ ID NO 489
; LENGTH: 697
; TYPE: PRT

; ORGANISM: Homo sapiens

; FEATURE:

; NAME/KEY: misc_feature

; OTHER INFORMATION: Incyte ID No. 6673549 2120743CD1
US-09-976-594-489 

Query Match 5.2%; Score 103; DB 4; Length 697; 

Best Local Similarity 22.0%; Pred. No. 0.085; 

Matches 82; Conservative 58; Mismatches 141; Indels 92; Gaps 19; 

Qy 22 VGLEVH GNLELVFTWSTVMMGL LMFS L- GCSVEI RKLWSHI RRPWGIAV 70 

: I I I : | M : : : : | : I I : I : : I I : I : 

Db 50 LGLYVRWEKTANSLILVIFILGLFVLGIASILYYYFSMEAASLSLSNLW FGFLL 103 

Qy 71 GLLCQFGLMPF TAYLLAI SFSLKPVQAIAVLIMGCC PGGTISNIFTFWV 119 

I I I I I I I I I I I : : : : II I I : I 

Db 104 GLLCFLDNSSFKNDVKEESTKYLLLTSIVLRILCSLVERISGYVRHRP--TLLTTVEF-- 159

Qy 120 DGDMDL SISMTTCSTVAALGMMPLCIYLYTWSWSLQQNLTIPYQNIGITLVCLTI-- 174

:: I : I : I I : | :: I : I I : : hill 

Db 160 LELVGFAI ASTTMLVEKSLSVI LLWALAMLI I DLRMKS FLAI PNLVI FAVLLFFS S 216 

Qy 175 PVAF GVYVNYRWPK QSKIILKIGAWGGVLLLV 207 

I : I I I : I I I : : | : : | | : : | 

Db 217 LETPKNPIAFACFFICLITDPFLDIYFSGLSVTERWKPFLYRGRICRRLSWFAGMIELT 276

Qy 208 VAVAGWLAKGS--WNSDITLLTISFIFPLIGHVTGFLLAL FTHQS 251

: : : I I : I I I : I I : I I I I III::

Db 277 FFI LSAFKLRDTHLWYFVI PGFS I FGI FWMI CHI I - FLLTLWGFHTKLNDCHKVYFTHRT 335

Qy 252 -WQRCRTISLETGAQNIQMCITMLQL S FTAEHLVQMLS FPLAYGLFQLI DGFLI VAA 307 

: | I : : I : II I I : : : I : I : I : I I I I 

Db 336 DYNSLDRIMASKGMRH--FCLISEQLVFFSLLATAILGAVSWQPTNGIF--LSMFLIVLP 391

Qy 308 YQTYKRRLKNKHG 320 

:: I :: I 

Db 392 LESMAHGLFHELG 404



Search completed: March 23, 2004, 14:38:24 
Job time : 24 secs



GenCore version 5.1.6 
Copyright (c) 1993 - 2004 Compugen Ltd. 



OM protein - protein search, using sw model 
Run on:         March 23, 2004, 14:33:32 ; Search time 20 Seconds
                (without alignments)
                1813.210 Million cell updates/sec

Title:          US-10-091-628-2
Perfect score:  1979
Sequence:       1 MRANCSSSSACPANSSEEEL...PGPMDCHRALEPVGHITSCE 377



Scoring table: BLOSUM62 

Gapop 10.0 , Gapext 0.5 

Searched: 283366 seqs, 96191526 residues 

Total number of hits satisfying chosen parameters: 283366



Minimum DB seq length: 0 

Maximum DB seq length: 2000000000 

Post-processing: Minimum Match 0% 

Maximum Match 100% 
Listing first 45 summaries 



Database : PIR_78:*
1: pir1:*
2: pir2:* 
3: pir3:* 
4: pir4:* 

Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 
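
The percentages printed with each result below appear to follow from the other numbers reported alongside them. As a rough cross-check (a sketch inferred from the figures in this report, not taken from GenCore documentation), Query Match looks like the hit's score divided by the perfect score (1979), and Best Local Similarity looks like Matches divided by the sum of Matches, Conservative, Mismatches and Indels. The function names are hypothetical.

```python
def query_match_percent(score: float, perfect_score: float) -> float:
    """Hit score as a percentage of the query's perfect (self-match) score."""
    return round(100.0 * score / perfect_score, 1)


def best_local_similarity(matches: int, conservative: int,
                          mismatches: int, indels: int) -> float:
    """Identical positions as a percentage of all aligned columns.

    Assumption: the column count is the sum of the four tallies printed
    on the "Matches ..." line of each result.
    """
    columns = matches + conservative + mismatches + indels
    return round(100.0 * matches / columns, 1)


# RESULT 1 (A49876): Score 884; Matches 164, Conservative 74,
# Mismatches 102, Indels 10; perfect score 1979.
print(query_match_percent(884, 1979))           # report prints 44.7%
print(best_local_similarity(164, 74, 102, 10))  # report prints 46.9%
```

The same two formulas also reproduce the figures printed for RESULT 2 (43.5% and 45.6%), which suggests the inference holds across this listing.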
ALIGNMENTS



RESULT 1 
A49876 

Na+-dependent bile acid transporter, ileal - golden hamster 
C; Species: Mesocricetus auratus (golden hamster) 

C;Date: 30-Jun-1995 #sequence_revision 30-Jun-1995 #text_change 05-Nov-1999 
C;Accession: A49876

R;Wong, M.H.; Oelkers, P.; Craddock, A.L.; Dawson, P. A. 
J. Biol. Chem. 269, 1340-1347, 1994 

A;Title: Expression cloning and characterization of the hamster ileal sodium-dependent bile acid transporter.

A;Reference number: A49876; MUID:94117449; PMID:8288599
A;Accession: A49876
A;Status: preliminary
A;Molecule type: mRNA
A;Residues: 1-348 <WON>

A;Cross-references: GB:U02028; NID:g455032; PIDN:AAA18640.1; PID:g455033
C;Keywords: transmembrane protein



Query Match 44.7%; Score 884; DB 2; Length 348; 

Best Local Similarity 46.9%; Pred. No. 2.5e-65; 



Matches 164; Conservative 74; Mismatches 102; Indels 10; Gaps 4; 



Qy 7 SSSACPANSS--EEELPVGLEVHGN--LELVFTVVSTVMMGLLMFSLGCSVEIRKLWSHI 62

: I I I I : : I : : I : I I : I : I I : : : I : I I I : I I : I I : I I :

Db 3 NSSICNPNATICEGDSCIAPESNFNAILSVVMSTVLTILLALVMFSMGCNVELHKFLGHL 62

Qy 63 RRPWGIAVGLLCQFGLMPFTAYLLAISFSLKPVQAIAVLIMGCCPGGTISNIFTFWVDGD 122

I I I I II II llllhll I : : I : : : I : I I I I : III I I I I I I I III : I I I I I 
Db 63 RRPWGIWGFLCQFGIMPLTGFVLSVAFGILPVQAVWLIQGCCPGGTASNILAYWVDGD 122 

Qy 123 MDLSISMTTCSTVAALGMMPLCIYLYTWSWSLQQNLTIPYQNIGITLVCLTIPVAFGVYV 182 

I I I I : I I I I I I I : I I I I I I I I ::: I I I : I I I : I I : I I I I I I : I : I I 

Db 123 MDLSVSMTTCSTLLALGMMPLCLFIYTKMWVDSGTIVIPYDSIGTSLVALVIPVSIGMYV 182

Qy 183 NYRWPKQSKIILKIGAWGGVLLLWAVAGWLAKGSWNSDITLLTISFIFPLIGHVTGF 242 

I : : I I : : : I I I I I I I : : I : I : : : : I I I : I : : I : I I I : I : I : II 
Db 183 NHKWPQKAKIILKIGSTAGAILIVLIAWGGILYQSAWTIEPKLWIIGTIYPIAGYGLGF 242 

Qy 243 LLALFTHQSWQRCRTISLETGAQNIQMCITMLQLSFTAEHLVQMLSFPLAYGLFQLIDGF 302 

II I I I I I I :: I I II I I I : I I :: I I I I : I I : : I I I I : I I : 

Db 243 FLARIAGQPWYRCRTVALETGLQNTQLCSTIVQLSFSPEDLNLVFTFPLIYSIFQIAFAA 302 

Qy 303 LIVAAYQTYKRRLKNKHGKKNSGCTEVCHTRKS--TSSRETNAFLEVNEE 350

: : : I I I I : INI: I M MM : M :

Db 303 ILLGAYVAYKK CHGKNNTELQEKTDNEMEPRSSFQETNKGFQPDEK 348



RESULT 2 
I38655

ileal sodium-dependent bile acid transporter - human 
C; Species: Homo sapiens (man) 

C;Date: 01-Mar-1996 #sequence_revision 01-Mar-1996 #text_change 21-Jul-2000
C;Accession: I38655

R;Wong, M.H.; Oelkers, P.; Dawson, P. A. 
J. Biol. Chem. 270, 27228-27234, 1995 

A;Title: Identification of a mutation in the ileal sodium-dependent bile acid transporter gene that abolishes transport activity.

A;Reference number: I38655; MUID:96070831; PMID:7592981

A;Accession: I38655

A;Status: preliminary

A;Molecule type: mRNA

A;Residues: 1-348 <RES>

A;Cross-references: EMBL:U10417; NID:g2623285; PIDN:AAC51870.1; PID:g595399
A;Experimental source: Crohn's disease patient (heterozygous)

A;Note: the wild type is shown; a form with 290-Ser was deficient in transport activity
C;Genetics:
A;Gene: SLC15-A2

Query Match 43.5%; Score 860.5; DB 2; Length 348; 

Best Local Similarity 45.6%; Pred. No. 2.2e-63;

Matches 160; Conservative 68; Mismatches 104; Indels 19; Gaps 4; 

Qy 5 CSSSSACPANSSEEELPVGLEVHGNLELVFTWSTVMMGLLMFSLGCSVEIRKLWSHIRR 64 

MM I : : I M : I I : : : I M I I M I M I I M MM 

Db 14 CSGASCWPESNFNNI L SWT. S T VLT I LLALVMFSMGCNVE I KK FLGH I KR 64 



Qy 65 PWGIAVGLLCQFGLMPFTAYLLAISFSLKPVQAIAVLIMGCCPGGTISNIFTFWVDGDMD 124

I I I I I I I I I I | : I | I : : I : : : I : I : I I : I I I : I I I I I I I I I I MINIM 
Db 65 PWGI CVGFLCQFGIMPLTGFI LSVAFDI LPLQAVWLI I GCCPGGTASNI LAYWVDGDMD 124 

Qy 125 LSISMTTCSTVAALGMMPLCIYLYTWSWSLQQNLTIPYQNIGITLVCLTIPVAFGVYVNY 184 

11:1111111: MINI II: Ml I : : I I I I II M I I M I : I : M I : 

Db 125 LSVSMTTCSTLLALGMMPLCLLIYTKMWVDSGSIVIPYDNIGTSLVALVVPVSIGMFVNH 184

Qy 185 RWPKQSKIILKIGAWGGVLLLWAVAGWLAKGSWNSDITLLTISFIFPLIGHVTGFLL 244 

: I I : : M I I I I I I : : I M : : : M I I M : M I I 111= I = MM 

Db 185 KWPQKAKIILKIGSIAGAILIVLIAWGGILYQSAWIIAPKLWIIGTIFPVAGYSLGFLL 244 

Qy 245 ALFTHQSWQRCRTISLETGAQNIQMCITMLQLSFTAEHLVQMLSFPLAYGLFQLIDGFLI 304 

I I I I I I : : I M I I I M I : M M M I I : MM I MM : 

Db 245 ARIAGLPWYRCRTVAFETGMQNTQLCSTIVQLSFTPEELNWFTFPLIYSIFQLAFAAIF 304 

Qy 305 VAAYQTYKRRLKNKHGKKNSGCTEVCHTRKSTSSRETNAFLEVNEEGAITP 355 

: I I I : III: I I : : M : I I 1 

Db 305 LGFYVAYKK CHGKNKAEIPE SKENGTEPESSFYKAN--GGFQP 345 



RESULT 3 
A41601 

Na+/taurocholate transport protein - rat 
C; Species: Rattus norvegicus (Norway rat) 

C;Date: 30-Jun-1992 #sequence_revision 30-Jun-1992 #text_change 16-Feb-1997 
C;Accession: A41601

R;Hagenbuch, B.; Stieger, B.; Foguet, M.; Luebbert, H.; Meier, P.J.
Proc. Natl. Acad. Sci. U.S.A. 88, 10629-10633, 1991

A;Title: Functional expression cloning and characterization of the hepatocyte Na(+)/bile acid cotransport system.

A;Reference number: A41601; MUID:92073340; PMID:1961729

A;Accession: A41601

A;Status: preliminary

A;Molecule type: mRNA

A;Residues: 1-362 <HAG>

A;Cross-references: GB:M77429

C;Keywords: transmembrane protein

Query Match 28.3%; Score 559.5; DB 2; Length 362; 

Best Local Similarity 37.2%; Pred. No. 1.4e-38; 

Matches 133; Conservative 69; Mismatches 135; Indels 21; Gaps 9; 

Qy 10 ACPANSSEEELPVGLEVHGNLELVFTVVSTVMMGLLMFSLGCSVEIRKLWSHIRRPWGIA 69

: I I I II I I : : : : : I : I M II I I : M | : : I : : I I : 
Db 7 SAPFNFS LPPGFG-HRATDKALSIILVLMLLLIMLSLGCTMEFSKIKAHLWKPKGVI 62 

Qy 70 VGLLCQFGLMPFTAYLLAISFSLKPVQAIAVLIMGCCPGGTISNIFTFWVDGDMDLSISM 129 

I I : II I M I I M I II : : I M M I II I II M I M I : I I I M I I I 

Db 63 VALVAQFGIMPLAAFLLGKIFHLSNIEALAILICGCSPGGNLSNLFTLAMKGDMNLSIVM 122

Qy 130 TTCSTVAALGMMPLCIYLYT WSWS LQQNLT I P YQN I GI TLVCLT I PVAFGVYVN YRW 186 

II II : M I I I I I I M M : : I : M I : I I M I : I I I : : : 

Db 123 TTCSSFSALGMMPLLLYVYSKGIYDGDLKDK— VPYKGIMISLVIVLIPCTIGIVLKSKR 180 

Qy 187 PKQSKIILKIGAVVGGVLLLVVAVAGVVLAKGSWNSDIT--LLTISFIFPLIGHVTGFLL 244

I I I I I : : M : I I : I M I I I : I I : I : M

Db 181 PHYVPYILKGGMIITFLLSVAVTALSVINVGNSIMFVMTPHLLATSSLMPFSGFLMGYIL 240

Qy 245 -ALFTHQSWQRC-RTISLETGAQNIQMCITMLQLSFTAEHLVQMLSFPLAYGLFQLIDGF 302

Ml I | | || | : I I I I I I I : I I : I :: I I : : I I I I : I I I : I

Db 241 SALF--QLNPSCRRTISMETGFQNIQLCSTILNVTFPPEVIGPLFFFPLLYMIFQLAEGL 298

Qy 303 LIVAAYQTYKRRLKNKHGKKNSGCTEVCHTRKSTSSRETNAFLEVNEEGAITPGPPGP 360

| |: :: I:: | |:: : :| I : I I I I I I

Db 299 LIIIIFRCYEKI KPPKDQTKITYKAAATEDATPAALEKGTHNGNIPPLQPGP 350



RESULT 4 
I55601

Na/taurocholate cotransporting polypeptide - human 
C; Species: Homo sapiens (man) 

C;Date: 02-Jul-1996 #sequence_revision 02-Jul-1996 #text_change 05-Nov-1999 

C;Accession: I55601

R;Hagenbuch, B.; Meier, P.J.

J. Clin. Invest. 93, 1326-1331, 1994

A;Title: Molecular cloning, chromosomal localization, and functional characterization of a human liver Na/bile acid cotransporter.
A;Reference number: I55601; MUID:94179485; PMID:8132774
A;Accession: I55601

A;Status: preliminary; translated from GB/EMBL/DDBJ
A;Molecule type: mRNA
A;Residues: 1-349 <RES>

A;Cross-references: GB:L21893; NID:g410213; PIDN:AAA36381.1; PID:g410214
C;Genetics:

A;Gene: GDB:SLC10A1; NTCP

A;Cross-references: GDB:344932; OMIM:182396
A;Map position: 14pter-14qter

Query Match 27.9%; Score 553; DB 2; Length 349; 

Best Local Similarity 36.0%; Pred. No. 4.6e-38; 

Matches 124; Conservative 77; Mismatches 109; Indels 34; Gaps 10; 



Qy 31 ELVFTVVSTVMMGLLMFSLGCSVEIRKLWSHIRRPWGIAVGLLCQFGLMPFTAYLLAISF 90

: I : | : | : : I I I I I : : I I : : I : : I I : I : I : I : I I I I : : I I

Db 24 DLALSVILVFMLFFIMLSLGCTMEFSKIKAHLWKPKGLAIALVAQYGIMPLTAFVLGKVF 83

Qy 91 SLKPVQAIAVLIMGCCPGGTISNIFTFWVDGDMDLSISMTTCSTVAALGMMPLCIYLYT- 149

Db 84 RLKNIEALAILVCGCSPGGNLSNVFSLAMKGDMNLSIVMTTCSTFCALGMMPLLLYIYSR 143

Qy 150 --WSWSLQQNLTIPYQNIGITLVCLTIPVAFGVYVNYRWPKQSKIILKIGAVVGGVLLLV 207

: | : : I I : I I : I I : I I I : : : I : : :: I I : I :

Db 144 GIYDGDLKDK--VPYKGIVISLVLVLIPCTIGIVLKSKRPQYMRYVIKGGMII ILL 197

Qy 208 VAVAG WLAKG S WN S D I - TLLTISFIFPLIGHWGFLL-ALFTHQSWQRC-RTIS 259



ill II- • i i ■ * i • • i iii i 

Db 198 CSVAVTVLSAINVGKSIMFAMTPLLIATSSLMPFIGFLLGYVLSALFCLNG--RCRRTVS 255

Qy 260 LETGAQNIQMCITMLQLSFTAEHLVQMLSFPLAYGLFQLIDGFLIVAAYQTYKRRLKNKH 319

: | | | | | : I : I I : I : : I I : : I I I I : I I I : I I : : I : h : I 

Db 256 METGCQNVQLCSTILNVAFPPEVIGPLFFFPLLYMIFQLGEGLLLIAIFWCYE-KFKTPK 314 

Qy 320 GKKNSGCTEVCHTRKSTSSRETNAFLEVNEEGAITPGPPGPMDC 363 



I I :: : I : I I I I : I II 

Db 315 DK TKMIYTAATT EETI PGALGNGTYKGEDC 344 



RESULT 5 
S01696 

gene P3 protein - human 

C; Species: Homo sapiens (man) 

C;Date: 30-Jun-1989 #sequence_revision 30-Jun-1989 #text_change 05-Nov-1999

C;Accession: S01696

R;Alcalay, M.; Toniolo, D.

Nucleic Acids Res. 16, 9527-9543, 1988

A;Title: CpG islands of the X chromosome are gene associated.
A;Reference number: S01696; MUID:89041548; PMID:3186440
A;Accession: S01696
A;Molecule type: DNA
A;Residues: 1-477 <ALC>

A;Cross-references: EMBL:X12458; NID:g35187; PIDN:CAA30998.1; PID:g35188

Query Match 16.9%; Score 333.5; DB 2; Length 477; 

Best Local Similarity 31.5%; Pred. No. 7.6e-20; 

Matches 87; Conservative 53; Mismatches 115; Indels 21; Gaps 3; 

Qy 12 PANSSEEELPVGLEVHGNLELVFTVVSTVMMGLLMFSLGCSVEIRKLWSHIRRPWGIAVG 71

||: | | ::::::: I I I I I : I : : I : : I 

Db 172 PAEDTPATLSADLAHFSENPILYLLLPLIFVNKCSF--GCKVELEVLKGLMQSPQPMLLG 229

Qy 72 LLCQFGLMPFTAYLLAI S FSLKPVQAIAVLIMGCCPGGTI SNI FTFWVDGDMDLS I SMTT 131 

| | | | : | | I : I : I II I : : : I III I : I : : I I : I : I I I I 

Db 230 LLGQFLVlVIPLYAFLMAKVFMLPKAIJU.GLIITCSSPGGGGSYLFSLLLGGDvTLAISMTF 289 

Qy 132 CSTVAALGMMPLCIYLYTWSWSLQQNLTIPYQNIGITLVCLTIPVAFGVYVNYRWPKQSK 191 

II || I I : M : I : I : : I : I I I I : : I I : I I I : : I I I : 

Db 290 LSTVAATGFLPLSSAIYSRLLSIHETLHVPISKILGTLLFIAIPIAVGVLIKSKLPKFSQ 349

Qy 192 IILKIGAWGGVLLL WAVAGWLAKGSWNSDITLLTISFIFPLIGHVTG 241

: : I : : I I I I I : I I : I : = : I I : I : I 

Db 350 LLLQWKP FS FVLLLGGLFLAYRMGVFI LAGI RL PIVLVGITVPLVGLLVG 400 

Qy 242 FLLALFTHQSWQRCRTISLETGAQNIQMCITMLQLS 277 

: I I : I I : I : I I I I : : I I I I I 

Db 401 YCLATCLKLPVAQRRTVSIEVGVQNSLLALAMLQLS 436 



RESULT 6 
E69902 

probable sodium-dependent transporter yocS - Bacillus subtilis 
C; Species: Bacillus subtilis 

C;Date: 05-Dec-1997 #sequence_revision 05-Dec-1997 #text_change 20-Jun-2000 
C;Accession: E69902

R;Kunst, F.; Ogasawara, N.; Moszer, I.; Albertini, A.M.; Alloni, G.; Azevedo,
V.; Bertero, M.G.; Bessieres, P.; Bolotin, A.; Borchert, S.; Boriss, R.;
Boursier, L.; Brans, A.; Braun, M.; Brignell, S.C.; Bron, S.; Brouillet, S.;
Bruschi, C.V.; Caldwell, B.; Capuano, V.; Carter, N.M.; Choi, S.K.; Codani,
J.J.; Connerton, I.F.; Cummings, N.J.; Daniel, R.A.; Denizot, F.; Devine, K.M.;
Duesterhoeft, A.; Ehrlich, S.D.; Emmerson, P.T.; Entian, K.D.; Errington, J.;
Fabret, C.; Ferrari, E.

Nature 390, 249-256, 1997

A;Authors: Foulger, D.; Fritz, C.; Fujita, M.; Fujita, Y.; Fuma, S.; Galizzi,
A.; Galleron, N.; Ghim, S.Y.; Glaser, P.; Goffeau, A.; Golightly, E.J.; Grandi,
G.; Guiseppi, G.; Guy, B.J.; Haga, K.; Haiech, J.; Harwood, C.R.; Henaut, A.;
Hilbert, H.; Holsappel, S.; Hosono, S.; Hullo, M.F.; Itaya, M.; Jones, L.;
Joris, B.; Karamata, D.; Kasahara, Y.; Klaerr-Blanchard, M.; Klein, C.;
Kobayashi, Y.; Koetter, P.; Koningstein, G.; Krogh, S.; Kumano, M.; Kurita, K.;
Lapidus, A.; Lardinois, S.

A;Authors: Lauber, J.; Lazarevic, V.; Lee, S.M.; Levine, A.; Liu, H.; Masuda,
S.; Maueel, C.; Medigue, C.; Medina, N.; Mellado, R.P.; Mizuno, M.; Moestl, D.;
Nakai, S.; Noback, M.; Noone, D.; O'Reilly, M.; Ogawa, K.; Ogiwara, A.; Oudega,
B.; Park, S.H.; Parro, V.; Pohl, T.M.; Portetelle, D.; Porwolik, S.; Prescott,
A.M.; Presecan, E.; Pujic, P.; Purnelle, B.; Rapoport, G.; Rey, M.; Reynolds,
S.; Rieger, M.; Rivolta, C.; Rocha, E.; Roche, B.; Rose, M.; Sadaie, Y.; Sato,
T.; Scanlon, E.

A;Authors: Schleich, S.; Schroeter, R.; Scoffone, F.; Sekiguchi, J.; Sekowska,
A.; Seror, S.J.; Serror, P.; Shin, B.S.; Soldo, B.; Sorokin, A.; Tacconi, E.;
Takagi, T.; Takahashi, H.; Takemaru, K.; Takeuchi, M.; Tamakoshi, A.; Tanaka,
T.; Terpstra, P.; Tognoni, A.; Tosato, V.; Uchiyama, S.; Vandenbol, M.; Vannier,
F.; Vassarotti, A.; Viari, A.; Wambutt, R.; Wedler, E.; Wedler, H.;
Weitzenegger, T.; Winters, P.; Wipat, A.; Yamamoto, H.; Yamane, K.; Yasumoto,
K.; Yata, K.; Yoshida, K.

A;Authors: Yoshikawa, H.F.; Zumstein, E.; Yoshikawa, H.; Danchin, A.

A;Title: The complete genome sequence of the Gram-positive bacterium Bacillus subtilis.

A;Reference number: A69580; MUID:98044033; PMID:9384377
A;Accession: E69902

A;Status: preliminary; nucleic acid sequence not shown; translation not shown
A;Molecule type: DNA
A;Residues: 1-321 <KUN>

A;Cross-references: GB:Z99114; GB:AL009126; NID:g2634230; PIDN:CAB13827.1; PID:g2634328

A;Experimental source: strain 168
C;Genetics:
A;Gene: yocS

C;Superfamily: Bacillus subtilis sodium-dependent transporter yocS

Query Match 16.4%; Score 325; DB 2; Length 321; 

Best Local Similarity 27.6%; Pred. No. 2.5e-19; 

Matches 84; Conservative 76; Mismatches 114; Indels 30; Gaps 12; 

Qy 33 VFTWS TVMMGLLMFSLGCSVEIRKLWSHIRRPWGIAVGLLCQFGLMPFTAYLLAIS 89

: | | : | | : : | : : | | : | : : : : I : II : : I : : I : : II I : I I

Db 32

Qy 90

111111111:11 I: ll:::|| I h I : II I

Db 92

Qy 150 'GVYVNYRWPKQ- SKI I -- LKI GAWGGVLL 205
I : I : | | : | : I : : I : I
Db 152 .GLIVKMFFRKOVAKAVHALPLVSVIG 202

Qy 206

|: |:: I | : : :: : II:: III I

Db 203



Qy 263 GAQNIQMCITMLQLSFTAEHLVQMLSFPLA-YGLFQLIDGFLIVAAYQTYKRRLKNKH-G 320 

| | | : | : I : : | | : :: : I :: I I : ::: I I I 

Db 262 GMQN SGLGAALATAHFSPLSAVPSAIFSVWHNLSGSML-ATY--WSKKVKKKQAG 313

Qy 321 KKNS 324 

I : I 

Db 314 SKSS 317 



RESULT 7 
D90031 

hypothetical protein SA2112 [imported] - Staphylococcus aureus (strain N315) 
C; Species: Staphylococcus aureus 

C;Date: 10-May-2001 #sequence_revision 10-May-2001 #text_change 22-Oct-2001 
C;Accession: D90031

R;Kuroda, M.; Ohta, T.; Uchiyama, I.; Baba, T.; Yuzawa, H.; Kobayashi, I.; Cui,
L.; Oguchi, A.; Aoki, K.; Nagai, Y.; Lian, J.; Ito, T.; Kanamori, M.; Matsumaru,
H.; Maruyama, A.; Murakami, H.; Hosoyama, A.; Mizutani-Ui, Y.; Kobayashi, N.;
Sawano, T.; Inoue, R.; Kaito, C.; Sekimizu, K.; Hirakawa, H.; Kuhara, S.; Goto,
S.; Yabuzaki, J.; Kanehisa, M.; Yamashita, A.; Oshima, K.; Furuya, K.; Yoshino,
C.; Shiba, T.; Hattori, M.; Ogasawara, N.; Hayashi, H.; Hiramatsu, K.
Lancet 357, 1225-1240, 2001

A;Title: Whole genome sequencing of meticillin-resistant Staphylococcus aureus.

A;Reference number: A89758; MUID:21311952; PMID:11418146

A;Accession: D90031

A;Status: preliminary

A;Molecule type: DNA

A;Residues: 1-305 <KUR>

A;Cross-references: GB:BA000018; PID:g13702121; PIDN:BAB43413.1; GSPDB:GN00149

A;Experimental source: strain N315

C;Genetics:

A;Gene: SA2112

C;Superfamily: Bacillus subtilis sodium-dependent transporter yocS 

Query Match 15.2%; Score 301.5; DB 2; Length 305; 

Best Local Similarity 27.9%; Pred. No. 2.1e-17; 

Matches 68; Conservative 60; Mismatches 101; Indels 15; Gaps 4; 

Qy 41 MMGLLMFSLGCSVEIRKLWSHIRRPWGIAVGLLCQFGLMPFTAYLLAISFSLKPVQAIAV 100
: | : : I : I : : : I : : I : I I : I I I : : : I I I i I I : I

Db 41

Qy 101

| M | I I I I I : : : : : I I : I : I I I I : I : I III:

Db 101

Qy 161 1 1 PVAFGV YVNYRWPKQSKI I LKI GAWGGVLLLWAVAG W 214

I I : I I : : I : : : : I I : I I I : I : I I

Db 161

Qy 215

| | : : : : : : | : I : II I : : I : I I I

Db 217

Qy 275 QLSF 278
I I

Db 272 ALHF 275



RESULT 8 
AD3295 

sodium/bile acid cotransporter homolog, sbf family BMEI0346 [imported] - 
Brucella melitensis (strain 16M) 
C; Species: Brucella melitensis 

C;Date: 01-Feb-2002 #sequence_revision 01-Feb-2002 #text_change 15-Feb-2002
C;Accession: AD3295

R;DelVecchio, V.G.; Kapatral, V.; Redkar, R.J.; Patra, G.; Mujer, C.; Los, T.;
Ivanova, N.; Anderson, I.; Bhattacharyya, A.; Lykidis, A.; Reznik, G.;
Jablonski, L.; Larsen, N.; D'Souza, M.; Bernal, A.; Mazur, M.; Goltsman, E.;
Selkov, E.; Elzer, P.H.; Hagius, S.; O'Callaghan, D.; Letesson, J.J.; Haselkorn,
R.; Kyrpides, N.; Overbeek, R.

Proc. Natl. Acad. Sci. U.S.A. 99, 443-448, 2002

A;Title: The genome sequence of the facultative intracellular pathogen Brucella melitensis.

A;Reference number: AD3252; PMID:11756688
A;Accession: AD3295
A;Status: preliminary
A;Molecule type: DNA
A;Residues: 1-318 <KUR>

A;Cross-references: GB:AE008917; PIDN:AAL51527.1; PID:g17982244; GSPDB:GN00190
A;Experimental source: strain 16M
C;Genetics:
A;Gene: BMEI0346
A;Map position: I

C;Superfamily: Bacillus subtilis sodium-dependent transporter yocS

Query Match 15.1%; Score 299.5; DB 2; Length 318; 

Best Local Similarity 27.9%; Pred. No. 3.2e-17; 



Matches 79; Conservative 66; Mismatches 117; Indels 21; Gaps 7; 

Qy 33 VFTVVSTVMMGLLMFSLGCSVEIRKLWSHIRRPWGIAVGLLCQFGLMPFTAYLLAISFSL 92

: | | : : | : : | | : | : : : : I I : : I : I : I I I : I I III : 

Db 35 I FAPWI WLLGI IMFGMGLTI SGKDFAEVAKRPFDVAI GVLAQFI IMPLLAVLLTRI I PM 94 

Qy 93 KPVQAIAVLIMGCCPGGTISNIFTFWVDGDMDLSISMTTCSTVAALGMMPLCIYLYTWSW 152

I I I : : : I I I I I I I I I : I : I I : I I : : I : : I : I : I : : : 
Db 95 S P EVAAGVT LVG C C P GGT AS NVMT YP S KG DVAL S VACT S VT T L LAP LVT P FLVW F FA 151 

Qy 153 SLQQNLTIPYQNIGITLV-CLTIPVAFGVYVNYRWP KQSKIILKIGAWGGVLLL-- 206

||: : : I : : I : : I : I I : I I : : I : : I I I I I : :
Db 152 --SQYLPVDAMSMFI S I VKVI LVPIALGFVLQKLVPG VVKAAVPMLPLVSVVGI VLIVAA 209

Qy 207 WAVAGWLAKGSWNSDITLLTISFIFPLIGHVTGFLLALFTHQSWQRCRTISLETGAQN 266 

| | | | : | : I : : : : I : I : I I I : : I I : I I I I 
Db 210 WAVNKAAIAQ SGLLI FAVWLHNCCGLLLGYFAARFAGLSLAKRKAI S I EVGMQN 265 

Qy 267 IQMCITMLQLSFTAEHLVQMLSFPLA-YGLFQLIDGFLIVAAY 308 

: | : I : : I I : : : I I I : : I 

Db 266 S GLGAALANAH FS P LAAVP S AVFS VWHN I S GAL VAS YY 303 



RESULT 9 
B83757 



sodium-dependent transporter BH0858 [imported] - Bacillus halodurans (strain C-125)

C;Species: Bacillus halodurans

C;Date: 01-Dec-2000 #sequence_revision 01-Dec-2000 #text_change 15-Jun-2001
C;Accession: B83757

R;Takami, H.; Nakasone, K.; Takaki, Y.; Maeno, G.; Sasaki, R.; Masui, N.; Fuji,
F.; Hirama, C.; Nakamura, Y.; Ogasawara, N.; Kuhara, S.; Horikoshi, K.
Nucleic Acids Res. 28, 4317-4331, 2000

A;Title: Complete genome sequence of the alkaliphilic bacterium Bacillus halodurans and genomic sequence comparison with Bacillus subtilis.

A;Reference number: A83650; MUID:20512582; PMID:11058132

A;Accession: B83757

A;Status: preliminary

A;Molecule type: DNA

A;Residues: 1-323 <STO>

A;Cross-references: GB:AP001510; GB:BA000004; NID:g10173440; PIDN:BAB04577.1; GSPDB:GN00137

A;Experimental source: strain C-125

C;Genetics:

A;Gene: BH0858

C;Superfamily: Bacillus subtilis sodium-dependent transporter yocS 

Query Match 15.1%; Score 299.5; DB 2; Length 323; 

Best Local Similarity 24.8%; Pred. No. 3.2e-17; 

Matches 80; Conservative 80; Mismatches 122; Indels 41; Gaps 9; 

Qy 34 FTWS TVMMGLLMFSLGCSVEIRKLWSHIRRPWGIAVGLLCQFGLMPFTAYLLAISF 90 

||:: | : : : | : : | | : | : : : : : : : I : I I : I I I : I I I : I I : I 

Db 33 FTWITPHITILL G VI M FGM GLTLKLSDFRI VL Q K P I P VL VG VLAQ FVI M P L VAF ALAYAF 92 

Qy 91 SLKPVQAIAVLIMGCCPGGTISNIFTFWVDGDMDLSISMTTCSTVAALGMMPLCIYLYTW 150 

: | I | : : : : I I I I I I I I : : I : : I : : I I : I I : I : I : I 
Db 93 N L P P E LAAGLVLVGAC P G GT AS N VMVYLAKGNVAAS VAMT S VS TMLAP I VT P F I LL L LAG 152 

Qy 151 SWSLQQNLTIPYQNIGITLV-CLTIPVAFGVYVNYRWPK QSKIILKI G AWG G VL L L 206 

| || :::::: : : I : I I : : I I : I : | : : : I : : : 

Db 153 QW L P I DAKAMFVS I LQMI I VP I AL GL FVRKMAPN AVD KS T AVL P LVS I V- AIMAI 206 

Qy 207 WAVAGVVIAKGSWNSDITLLTISFIFPLIGHVTGFLLALFTHQSWQRCRTISLETGAQN 266 

| | | | | : : | : : : | : | : | | I I I I : I I I I 

Db 207 VSAWGANQANLMSGAALLFLAV-MLHNVFGLLLGYLTAKFVGLDESTRRAISIEVGMQN 265 

Qy 267 IQMCITMLQLSFTAEHLVQMLSFPLA-YGLFQLIDGFLIVAAYQTYKRRLKNKHGKKNSG 325 

: | : I : : | | : : : | I : : I : : 
Db 266 SGLGAALAGNHFSPLAALPSAIFSVWHNISGPVLVSIWS 304 

Qy 326 CTEVCHTRKSTSSRETNAFLEVN 348 

: II I : : : I : : I : 
Db 305 RSAKSAQKRQSDADMKVD 322 



RESULT 10 
T02645 

hypothetical protein At2g26900 [imported] - Arabidopsis thaliana 
N; Alternate names: hypothetical protein F12C20.6 
C; Species: Arabidopsis thaliana (mouse-ear cress) 

C;Date: 24-Mar-1999 #sequence_revision 24-Mar-1999 #text_change 16-Feb-2001

C;Accession: T02645; C84666

R;Rounsley, S.D.; Ronning, C.M.; Lin, X.; Ketchum, K.A.; Crosby, M.L.; Brandon,
R.C.; Sykes, S.M.; Kaul, S.; Mason, T.M.; Kerlavage, A.R.; Adams, M.D.;
Somerville, C.R.; Venter, J.C.

submitted to the EMBL Data Library, August 1998

A;Description: Arabidopsis thaliana chromosome II BAC F12C20 genomic sequence.
A;Reference number: Z14685
A;Accession: T02645

A;Status: translated from GB/EMBL/DDBJ
A;Molecule type: DNA
A;Residues: 1-338 <ROU>

A;Cross-references: EMBL:AC005168; NID:g3426033; PID:g3426051
A;Experimental source: cultivar Columbia

R;Lin, X.; Kaul, S.; Rounsley, S.D.; Shea, T.P.; Benito, M.I.; Town, C.D.;
Fujii, C.Y.; Mason, T.M.; Bowman, C.L.; Barnstead, M.E.; Feldblyum, T.V.; Buell,
C.R.; Ketchum, K.A.; Lee, J.J.; Ronning, C.M.; Koo, H.; Moffat, K.S.; Cronin,
L.A.; Shen, M.; VanAken, S.E.; Umayam, L.; Tallon, L.J.; Gill, J.E.; Adams,
M.D.; Carrera, A.J.; Creasy, T.H.; Goodman, H.M.; Somerville, C.R.; Copenhaver,
G.P.; Preuss, D.; Nierman, W.C.; White, O.; Eisen, J.A.; Salzberg, S.L.; Fraser,
C.M.; Venter, J.C.
Nature 402, 761-768, 1999

A;Title: Sequence and analysis of chromosome 2 of the plant Arabidopsis thaliana.

A;Reference number: A84420; MUID:20083487; PMID:10617197
A;Accession: C84666
A;Status: preliminary
A;Molecule type: DNA
A;Residues: 1-338 <STO>

A;Cross-references: GB:AE002093; NID:g3426051; PIDN:AAC32250.1; GSPDB:GN00139
C; Genetics : 

A; Gene: At2g26900; F12C20.6 
A;Map position: 2 

A;Introns: 22/2; 61/3; 99/3; 120/3; 163/2; 190/3; 208/1; 240/3; 293/3 

Query Match 14.4%; Score 284; DB 2; Length 338; 

Best Local Similarity 25.8%; Pred. No. 6.4e-16; 

Matches 80; Conservative 54; Mismatches 120; Indels 56; Gaps 8; 

Qy 9 SAC PANS SEEELPVGLEVHGNL ELVFTV VSTVMM 42

I I I : I I I : I I I = I : : I = :

Db 15

Qy 43

| | | | : I : : : I I I : I I I I : : I : I : I

Db 75

Qy 103

| | I I I I : I : I :: I I : I I I I I I : I : I I I I : I

Db 135 !C PGGQASNVAT YI S KGNVALS VLMTTCST I GAI IMT P LLT KLLAGQLVPV 187

Qy 163 GI T L VC L T I P VA F G VYVN Y RW PKQSKIILKI G AWG G VL L LWA VAGV 213

| : I : : I | | I : I I : I : : : : I : I : = III

Db 188

Qy 214 -SWQRCRTISLETGAQNIQM 269
I I I I : I I I : :

Db 248



Qy 270 CITMLQLSFT 279 

: I II 

Db 298 GFLLAQKHFT 307 



RESULT 11 
F83236 

probable transporter PA3264 [imported] - Pseudomonas aeruginosa (strain PAO1)
C; Species: Pseudomonas aeruginosa 

C;Date: 15-Sep-2000 #sequence_revision 15-Sep-2000 #text_change 31-Dec-2000 
C;Accession: F83236 

R;Stover, C.K.; Pham, X.Q.; Erwin, A.L.; Mizoguchi, S.D.; Warrener, P.; Hickey,
M.J.; Brinkman, F.S.L.; Hufnagle, W.O.; Kowalik, D.J.; Lagrou, M.; Garber, R.L.;
Goltry, L.; Tolentino, E.; Westbrook-Wadman, S.; Yuan, Y.; Brody, L.L.; Coulter,
S.N.; Folger, K.R.; Kas, A.; Larbig, K.; Lim, R.M.; Smith, K.A.; Spencer, D.H.;
Wong, G.K.S.; Wu, Z.; Paulsen, I.T.; Reizer, J.; Saier, M.H.; Hancock, R.E.W.;
Lory, S.; Olson, M.V.
Nature 406, 959-964, 2000

A;Title: Complete genome sequence of Pseudomonas aeruginosa PAO1, an opportunistic pathogen.

A;Reference number: A82950; MUID:20437337; PMID:10984043
A;Accession: F83236
A;Status: preliminary
A;Molecule type: DNA
A;Residues: 1-311 <STO>

A;Cross-references: GB:AE004749; GB:AE004091; NID:g9949388; PIDN:AAG06652.1; GSPDB:GN00131; PASP:PA3264

A;Experimental source: strain PAO1

C;Genetics:

A;Gene: PA3264

C;Superfamily: Bacillus subtilis sodium-dependent transporter yocS

Query Match 13.6%; Score 269.5; DB 2; Length 311; 

Best Local Similarity 25.7%; Pred. No. 9.2e-15; 

Matches 75; Conservative 74; Mismatches 116; Indels 27; Gaps 8; 

Qy 32 LVFTWSTVMMGLLMFSLGCSVEIRKLWSHIRRPWGIAVGLLCQFGLMPFTAYLLAISFS 91
| | ::||:|| :| ::: II : :|:| II :|| |:||

Db 34

Qy 92

I : | : : : I I I I I I I I I : I : I I ' I I : : : I : : I : I

Db 94

Qy 152

|:| |: :::: : :: :| I :||

Db 154

Qy 211

| | | | : : : I I : : : I : I : : I I I : I : : : : I I

Db 208

Qy 265

|| : | : | : : | I : :: : I I: I : III

Db 264 nM SGLGAALANAHFSPLAAVPSALFSVWHNLSGSLLAALF RRL 306



RESULT 12 
B81168 

transporter NMB0705 [imported] - Neisseria meningitidis (strain MC58 serogroup 
B) 

C; Species: Neisseria meningitidis 

C;Date: 31-Mar-2000 #sequence_revision 31-Mar-2000 #text_change 19-Jan-2001 
C;Accession: B81168

R;Tettelin, H.; Saunders, N.J.; Heidelberg, J.; Jeffries, A.C.; Nelson, K.E.;
Eisen, J.A.; Ketchum, K.A.; Hood, D.W.; Peden, J.F.; Dodson, R.J.; Nelson, W.C.;
Gwinn, M.L.; DeBoy, R.; Peterson, J.D.; Hickey, E.K.; Haft, D.H.; Salzberg,
S.L.; White, O.; Fleischmann, R.D.; Dougherty, B.A.; Mason, T.; Ciecko, A.;
Parksey, D.S.; Blair, E.; Cittone, H.; Clark, E.B.; Cotton, M.D.; Utterback,
T.R.; Khouri, H.; Qin, H.; Vamathevan, J.; Gill, J.; Scarlato, V.; Masignani,
V.; Pizza, M.

Science 287, 1809-1815, 2000

A;Authors: Grandi, G.; Sun, L.; Smith, H.O.; Fraser, C.M.; Moxon, E.R.;
Rappuoli, R.; Venter, J.C.

A;Title: Complete genome sequence of Neisseria meningitidis serogroup B strain MC58.

A;Reference number: A81000; MUID:20175755; PMID:10710307
A;Accession: B81168
A;Status: preliminary
A;Molecule type: DNA
A;Residues: 1-315 <TET>

A;Cross-references: GB:AE002425; GB:AE002098; NID:g7225930; PIDN:AAF41122.1; PID:g7225934; GSPDB:GN00119; TIGR:NMB0705

A;Experimental source: serogroup B, strain MC58

C;Genetics:

A;Gene: NMB0705

C;Superfamily: Bacillus subtilis sodium-dependent transporter yocS

Query Match 13.5%; Score 266.5; DB 2; Length 315; 

Best Local Similarity 26.6%; Pred. No. 1.6e-14; 



Matches 81; Conservative 65; Mismatches 102; Indels 57; Gaps 10;

Qy 41 MMGLLMFSLGCSVEIRKLWSHIRRPWGIAVGLLCQFGLMPFTAYLLAISFSLKPVQAIAV 100

: : | : : | | : I : : : : 1 : : | : : I I : 1 1 1 1 : 1 1 : : 1 hi
Db 43 LLGIIMFGMGLTLKPSDFDILFKHPKWIIGVIAQFAIMPATAWLLSKLLNLPAEIAVGV 102

Qy 101 LIMGCCPGGTISNIFTFWVDGDMDLSISMTTCSTVAALGMMPLCIYLYTWSWSLQQNLTI 160
: : : I I M I 1 1 1 1 : 1 : | : : | | : : : I : I I : : : 1 I : I : 1 1
Db 103 I LVGC C P G GT AS NVMT YLARGNVAL S VAVT SVSTLISPLLTP-AI FLML AG EML E I 157

Qy 161 PYQNIGITLV-CLTIPVAFGVYVNYRWPKQSK IILKIGAWG 201

: : : : | : : | : | : I : : : : h 1 1 1 1 1 1 1
Db 158 Q AAGMLM S I VKMVL L P I VLGL I VH KVL G S KT E KLT DAL P LVS VAAI VL 1 1 GAWGAS KGK 217

Qy 202 ----GVLLLWAVAGWIAKGSWNSDITLLTISFIFPLIGHWGFLLALFTHQSWQRCRT 257

|:h 1 Nil lh: II 1 :l : =1

Db 218 IMESGLLIFAV WLHNG IGYLLGFFAAKWTGLPYDAQKT 256

Qy 258 ISLETGAQNIQMCITMLQLSFTAEHLVQMLSFPLA-YGLFQLIDGFLIVAAYQTYKRRLK 316

: : : | I 1 1 : : II : 1 : | | : : : I I h II

Db 257 LT I EVGMQN S GLAAALAAAH FAAAPW AVP GAL FS VWHN I S GS LLA TYWAAKA 309

Qy 317 NKHGK 321

Db 310 GKHKK 314



RESULT 13 
E81937 

probable transmembrane transport protein NMA0909 [imported] - Neisseria 
meningitidis (strain Z2491 serogroup A) 
C; Species: Neisseria meningitidis 

C;Date: 05-May-2000 #sequence_revision 05-May-2000 #text_change 02-Feb-2001 
C;Accession: E81937 

R;Parkhill, J.; Achtman, M.; James, K.D.; Bentley, S.D.; Churcher, C.; Klee,
S.R.; Morelli, G.; Basham, D.; Brown, D.; Chillingworth, T.; Davies, R.M.;
Davis, P.; Devlin, K.; Feltwell, T.; Hamlin, N.; Holroyd, S.; Jagels, K.;
Leather, S.; Moule, S.; Mungall, K.; Quail, M.A.; Rajandream, M.A.; Rutherford,
K.M.; Simmonds, M.; Skelton, J.; Whitehead, S.; Spratt, B.G.; Barrell, B.G.
Nature 404, 502-506, 2000

A;Title: Complete DNA sequence of a serogroup A strain of Neisseria meningitidis Z2491.

A;Reference number: A81775; MUID:20222556; PMID:10761919
A;Accession: E81937
A;Status: preliminary
A;Molecule type: DNA
A;Residues: 1-315 <PAR>

A;Cross-references: GB:AL162754; GB:AL157959; NID:g7379424; PIDN:CAB84186.1; PID:g7379621; GSPDB:GN00124; NMASP:NMA0909

A;Experimental source: serogroup A, strain Z2491

C;Genetics:

A;Gene: NMA0909

C;Superfamily: Bacillus subtilis sodium-dependent transporter yocS

Query Match 13.4%; Score 265.5; DB 2; Length 315; 

Best Local Similarity 25.9%; Pred. No. 2e-14; 

Matches 79; Conservative 68; Mismatches 101; Indels 57; Gaps 10; 



Qy 41 MMGLLMFSLGCSVEIRKLWSHIRRPWGIAVGLLCQFGLMPFTAYLLAISFSLKPVQAIAV 100
: : | : : I I : I : : : : I : : 1 : : 1 1 : 1 1 1 1 : i 1 : : 1 hi
Db 43 LLGIIMFGMGLTLKPSDFDILFKHPKAVIIGVIAQFAIMPATAWLLSKLLNLPAEIAVGV 102

Qy 101 LIMGCCPGGTISNIFTFWVDGDMDLSISMTTCSTVAALGMMPLCIYLYTWSWSLQQNLTI 160
: : : 1 1 1 1 1 1 1 1 1: 1 : I : : 1 1 : :: h 1 h : : 1 hi : 1 1
Db 103 ILVGCCPGGTASNVMTYLARGNVALSVAVTSVSTLISPLLTP-AIFLML AGEMLEI 157

Qy 161 PYQNIGITLV-CLTIPVAFGVYVNYRWPKQSK IILKIGAWG 201
: : : : : | : : | : | : I : : : : 1 = 1 1 1 I 1 1 1
Db 158 QAGSMLMS IVKMVLLPIVLGLIVHKVLGSKTEKLTD7VLPLVSVAAIVLI I GAWGASKGK 217

Qy 202 GVLLLWAVAGWLAKGSWNSDITLLTISFIFPLIGHVTGFLIALFTHQSWQRCRT 257
1 : 1 : 1 1 1 1 1 | | : : | | | : 1 : : 1
Db 218 IMESGLLI FAV WLHNG IGYLLGFFAAKWTGLPYDAQKT 256

Qy 258 ISLETGAQNIQMCITMLQLSFTAEHLVQMLSFPLA-YGLFQLIDGFLIVAAYQTYKRRLK 316
: : : I 1 1 1 : : 1 : : : : I I : : : III: II
Db 257 LAI EVGMQN S G LAAALAAAH FA VAP WAVP GAL FSVWHN I S G S L LA T YWAAKA 309

Qy 317 NKHGK 321
1 1 I
Db 310 GKHKK 314



RESULT 14 
D83438 

probable transporter PA1650 [imported] - Pseudomonas aeruginosa (strain PAO1)
C;Species: Pseudomonas aeruginosa
C;Date: 15-Sep-2000 #sequence_revision 15-Sep-2000 #text_change 31-Dec-2000
C;Accession: D83438
R;Stover, C.K.; Pham, X.Q.; Erwin, A.L.; Mizoguchi, S.D.; Warrener, P.; Hickey,
M.J.; Brinkman, F.S.L.; Hufnagle, W.O.; Kowalik, D.J.; Lagrou, M.; Garber, R.L.;
Goltry, L.; Tolentino, E.; Westbrook-Wadman, S.; Yuan, Y.; Brody, L.L.; Coulter,
S.N.; Folger, K.R.; Kas, A.; Larbig, K.; Lim, R.M.; Smith, K.A.; Spencer, D.H.;
Wong, G.K.S.; Wu, Z.; Paulsen, I.T.; Reizer, J.; Saier, M.H.; Hancock, R.E.W.;
Lory, S.; Olson, M.V.
Nature 406, 959-964, 2000
A;Title: Complete genome sequence of Pseudomonas aeruginosa PAO1, an
opportunistic pathogen.
A;Reference number: A82950; MUID: 20437337; PMID: 10984043
A;Accession: D83438
A;Status: preliminary
A;Molecule type: DNA
A;Residues: 1-297 <STO>
A;Cross-references: GB:AE004593; GB:AE004091; NID:g9947619; PIDN:AAG05039.1;
GSPDB:GN00131; PASP:PA1650
A;Experimental source: strain PAO1
C;Genetics:
A;Gene: PA1650
C;Superfamily: Bacillus subtilis sodium-dependent transporter yocS

Query Match 13.0%; Score 257.5; DB 2; Length 297; 

Best Local Similarity 26.2%; Pred. No. 8.5e-14; 

Matches 75; Conservative 60; Mismatches 128; Indels 23; Gaps 7; 

Qy 33 VFTWSTVMMGLLMFSLGCSVEIRKLWSHIRRPWGIAVGLLCQFGLMPFTAYLLAISFSL 92 

: I : : : I : : I III: : I I : I I I : I I I : I : I : hi 

Db 6 ILTLFLPIALGIIMLGLGLSLTPADFLRWRYPKPVLVGLVCQIVLLPLACFLIVQGFAL 65 

Qy 93 KPVQAIAVLIMGCCPGGTISNIFTFWVDGDMDLSISMTTCSTVAALGMMPLCIYLYTWSW 152 

: | : : : : : I I I I : I : : : | | : | : I : : I : : I I : I I I : I 
Db 66 EAALAVGMMLLAASPGGTTANLYSHLAHGDVALNITLTAVNSVIAILTMPLIVNL 120 

Qy 153 SLQ QN LT I P YQN I G I T LVC LT I P VAFGVYVN YRW P KQ S K 1 1 L KI GAWGGVLLL 206 

III | : : : : I : I I I I : I I I : : I :: : I I I 

Db 121 S LQ Y FMG D GQAI P LQ FG KWQ VFVI VL G P VAI GML VRN RL P AVAD RLQ K P VK I L SAL L L L 180 

Qy 207 WAVAGWLAKG S WN S D I T L LT I SFIFPLIGHVTGFLLALFTHQSWQRCRTISLET 262 

| : : : | | | I : : | : : I h I : : : I : I 

Db 181 VIIL — LALAK-DWQTFWYAPVVGLAALAFNLLSIA^ 237 

Qy 263 GAQN I QMC I TMLQ L S FT AEH L VQML S F P LAYGL FQLIDGFLI VAAY 308 

I I : I I I I : : I III: I I I : 

Db 238 G I HN GT LAI A- LAL S P S L LNN S TMAI P P AI YGVLM FFTAAAF 27 8 



RESULT 15 
E70482 



Na(+) dependent transporter (Sbf family) - Aquifex aeolicus
C;Species: Aquifex aeolicus
C;Date: 08-May-1998 #sequence_revision 08-May-1998 #text_change 09-Jun-2000
C;Accession: E70482
R;Deckert, G.; Warren, P.V.; Gaasterland, T.; Young, W.G.; Lenox, A.L.; Graham,
D.E.; Overbeek, R.; Snead, M.A.; Keller, M.; Aujay, M.; Huber, R.; Feldman,
R.A.; Short, J.M.; Olson, G.J.; Swanson, R.V.
Nature 392, 353-358, 1998
A;Title: The complete genome of the hyperthermophilic bacterium Aquifex
aeolicus.
A;Reference number: A70300; MUID: 98196666; PMID: 9537320
A;Accession: E70482
A;Status: preliminary; nucleic acid sequence not shown; translation not shown
A;Molecule type: DNA
A;Residues: 1-297 <AQF>
A;Cross-references: GB:AE000774; NID:g2984324; PIDN:AAC07854.1; PID:g2984333;
GB:AE000657
A;Experimental source: strain VF5
C;Genetics:
A;Gene: sbf
C;Superfamily: Bacillus subtilis sodium-dependent transporter yocS

Query Match 12.6%; Score 250; DB 2; Length 297; 

Best Local Similarity 26.0%; Pred. No. 3.5e-13; 

Matches 81; Conservative 64; Mismatches 126; Indels 40; Gaps 10; 

Qy 29 NLELVFTWSTVMMGLL MFSLGCSVEIRKLWSHIRRPWGIA 69

: : : I | : : | : | I I : I : : I : I : : 

Db 3 DFSFLLILVSLSLLGILFPEFFANLKPLILPLLIVIMLSMGLTLTPEDFKEIARKPFIVF 62 

Qy 70 VGLLCQFGLMPFTAYLLAISFSLKPVQAIAVLIMGCCPGGTISNIFTFWVDGDMDLSISM 129 

| I I : : I I : I I I : III : I : : : I I I I I I I : I : I I : I I I I 

Db 63 YGALLQYTVMPLSGYLLSKLFKLPPELLVGWLVGSAPGGTASNLITYLSRGDLSYSISM 122 

Qy 130 TTCSTVAALGMMPLCIYLYTWSWSLQQNLTIPYQNI-GITLVCLTIPVAFGV YVNY 184 

|| ||: : 111: : : : I : : : I I : : I I I : : : I 

Db 123 TTTSTLLSPLFTPLWTYVLAGKY VEVPFLSMFETTLKIVIVPVLLGMVLRYFLRY 177 

Qy 185 RWPKQSKIILKIGAWGGVLLLWAVAGWLAKGSWNSDITLLTISFIFPLIGHVTGFLL 244 

: I I I || : I : : I I : : I : | : : | : : | : | : | 

Db 178 QINKVEK-FLPFLAVFS — ISLIIAVIFALNSKLLKELSFLVLSWLIHNVLGFLLGYLF 234 

Qy 245 ALFTHQSWQRCRTISLETGAQNIQMCITMLQLSFTAEHLVQMLSFPLAYGLFQLIDGFLI 304 

| : : : : | : | | | | : I : I I : : : : : I I : I I II 
Db 235 GLLAGL D KRKVKAL S I EVGMQN S GL S - T VLAL K Y FS KVSALPSA--IFSLSQN-LI 286 

Qy 305 VAAYQTYKRRL 315 

: I I I 

Db 287 GWLSLFFRRL 297



Search completed: March 23, 2004, 14:37:49 
Job time : 22 secs



GenCore version 5.1.6 
Copyright (c) 1993 - 2004 Compugen Ltd. 



OM protein - protein search, using sw model

Run on:         March 23, 2004, 14:35:53 ; Search time 45 Seconds
                                           (without alignments)
                                           2169.470 Million cell updates/sec

Title:          US-10-091-628-2
Perfect score:  1979
Sequence:       1 MRANCSSSSACPANSSEEEL...PGPMDCHRALEPVGHITSCE 377

Scoring table:  BLOSUM62
                Gapop 10.0 , Gapext 0.5

Searched:       1049977 seqs, 258955339 residues

Total number of hits satisfying chosen parameters:      1049977

Minimum DB seq length: 0
Maximum DB seq length: 2000000000

Post-processing: Minimum Match 0%
                 Maximum Match 100%
                 Listing first 45 summaries
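
The search parameters above name a Smith-Waterman ("sw") model with the BLOSUM62 table and affine gap penalties (Gapop 10.0, Gapext 0.5). As a rough illustration of how a local alignment score of this kind can be computed, here is a minimal Python sketch of the Gotoh affine-gap recurrence. It is not GenCore's implementation: the function name `smith_waterman_affine`, the `toy_score` function (a stand-in for BLOSUM62) and the convention that a gap of length k costs `gap_open + k * gap_extend` are assumptions made for this example.

```python
def toy_score(x, y):
    """Hypothetical stand-in for a substitution table such as BLOSUM62."""
    return 5.0 if x == y else -4.0


def smith_waterman_affine(a, b, score, gap_open=10.0, gap_extend=0.5):
    """Return the best local alignment score of a against b.

    Assumed gap convention: a gap of length k costs gap_open + k * gap_extend.
    """
    neg_inf = float("-inf")
    n, m = len(a), len(b)
    # H: best score of a local alignment ending at (i, j)
    # E: best score ending at (i, j) with a gap that consumes b[j-1]
    # F: best score ending at (i, j) with a gap that consumes a[i-1]
    H = [[0.0] * (m + 1) for _ in range(n + 1)]
    E = [[neg_inf] * (m + 1) for _ in range(n + 1)]
    F = [[neg_inf] * (m + 1) for _ in range(n + 1)]
    best = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Either extend an existing gap or open a new one.
            E[i][j] = max(E[i][j - 1] - gap_extend,
                          H[i][j - 1] - gap_open - gap_extend)
            F[i][j] = max(F[i - 1][j] - gap_extend,
                          H[i - 1][j] - gap_open - gap_extend)
            diag = H[i - 1][j - 1] + score(a[i - 1], b[j - 1])
            # The floor at 0 is what makes the alignment local.
            H[i][j] = max(0.0, diag, E[i][j], F[i][j])
            best = max(best, H[i][j])
    return best


if __name__ == "__main__":
    print(smith_waterman_affine("ACGTACGTACGT", "ACGTACTACGT", toy_score))
```

With a real substitution table in place of `toy_score`, aligning a sequence against itself gives the sum of its diagonal scores, which is presumably what the report lists as the "Perfect score".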



Database :       Published_Applications_AA:*
                 1:  /cgn2_6/ptodata/1/pubpaa/US07_PUBCOMB.pep:*
                 2:  /cgn2_6/ptodata/1/pubpaa/PCT_NEW_PUB.pep:*
                 3:  /cgn2_6/ptodata/1/pubpaa/US06_NEW_PUB.pep:*
                 4:  /cgn2_6/ptodata/1/pubpaa/US06_PUBCOMB.pep:*
                 5:  /cgn2_6/ptodata/1/pubpaa/US07_NEW_PUB.pep:*
                 6:  /cgn2_6/ptodata/1/pubpaa/PCTUS_PUBCOMB.pep:*
                 7:  /cgn2_6/ptodata/1/pubpaa/US08_NEW_PUB.pep:*
                 8:  /cgn2_6/ptodata/1/pubpaa/US08_PUBCOMB.pep:*
                 9:  /cgn2_6/ptodata/1/pubpaa/US09A_PUBCOMB.pep:*
                 10: /cgn2_6/ptodata/1/pubpaa/US09B_PUBCOMB.pep:*
                 11: /cgn2_6/ptodata/1/pubpaa/US09C_PUBCOMB.pep:*
                 12: /cgn2_6/ptodata/1/pubpaa/US09_NEW_PUB.pep:*
                 13: /cgn2_6/ptodata/1/pubpaa/US10A_PUBCOMB.pep:*
                 14: /cgn2_6/ptodata/1/pubpaa/US10B_PUBCOMB.pep:*
                 15: /cgn2_6/ptodata/1/pubpaa/US10C_PUBCOMB.pep:*
                 16: /cgn2_6/ptodata/1/pubpaa/US10_NEW_PUB.pep:*
                 17: /cgn2_6/ptodata/1/pubpaa/US60_NEW_PUB.pep:*
                 18: /cgn2_6/ptodata/1/pubpaa/US60_PUBCOMB.pep:*

Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 
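
The summary lines printed with each result (Query Match, Best Local Similarity, and the Matches/Conservative/Mismatches/Indels counts) can be cross-checked against one another. The relations in the sketch below are inferred from the figures in this report, not taken from GenCore documentation: Query Match appears to be the hit score as a percentage of the perfect score, and Best Local Similarity appears to be the identical matches as a percentage of all alignment columns. The function names are hypothetical.

```python
def query_match_percent(score, perfect_score):
    """Hit score as a percentage of the query's self-alignment score."""
    return 100.0 * score / perfect_score


def best_local_similarity_percent(matches, conservative, mismatches, indels):
    """Identical columns as a percentage of all alignment columns."""
    columns = matches + conservative + mismatches + indels
    return 100.0 * matches / columns


if __name__ == "__main__":
    # Figures reported for the Neisseria meningitidis hit E81937
    # (Score 265.5 against a perfect score of 1979).
    print(round(query_match_percent(265.5, 1979), 1))                 # 13.4
    print(round(best_local_similarity_percent(79, 68, 101, 57), 1))   # 25.9
```

Both values agree with the "Query Match 13.4%" and "Best Local Similarity 25.9%" lines reported for that hit.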
ALIGNMENTS



RESULT 1 
US-10-091-628-2 

; Sequence 2, Application US/10091628 
; Publication No. US20020164627A1 
; GENERAL INFORMATION: 

; APPLICANT: Wilganowski, Nathaniel L. 



; APPLICANT: Nepomnichy, Boris 
; APPLICANT: Burnett, Michael B. 
; APPLICANT: Hu, Yi 

; TITLE OF INVENTION: Novel Human Transporter Proteins and Polynucleotides Encoding the
; TITLE OF INVENTION: Same
; FILE REFERENCE: LEX-0314-USA
; CURRENT APPLICATION NUMBER: US/10/091,628
; CURRENT FILING DATE: 2002-03-06
; PRIOR APPLICATION NUMBER: US 60/275,009
; PRIOR FILING DATE: 2001-03-12
; PRIOR APPLICATION NUMBER: US 60/284,152
; PRIOR FILING DATE: 2001-04-17
; NUMBER OF SEQ ID NOS: 6
; SOFTWARE: FastSEQ for Windows Version 4.0
; SEQ ID NO 2
; LENGTH: 377
; TYPE: PRT
; ORGANISM: Homo sapiens
US-10-091-628-2 

Query Match 100.0%; Score 1979; DB 13; Length 377;
Best Local Similarity 100.0%; Pred. No. 4.2e-181;
Matches 377; Conservative 0; Mismatches 0; Indels 0; Gaps 0;

Qy   1 MRANCSSSSACPANSSEEELPVGLEVHGNLELVFTVVSTVMMGLLMFSLGCSVEIRKLWS 60
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1 MRANCSSSSACPANSSEEELPVGLEVHGNLELVFTVVSTVMMGLLMFSLGCSVEIRKLWS 60

Qy  61 HIRRPWGIAVGLLCQFGLMPFTAYLLAISFSLKPVQAIAVLIMGCCPGGTISNIFTFWVD 120
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  61 HIRRPWGIAVGLLCQFGLMPFTAYLLAISFSLKPVQAIAVLIMGCCPGGTISNIFTFWVD 120

Qy 121 GDMDLSISMTTCSTVAALGMMPLCIYLYTWSWSLQQNLTIPYQNIGITLVCLTIPVAFGV 180
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 121 GDMDLSISMTTCSTVAALGMMPLCIYLYTWSWSLQQNLTIPYQNIGITLVCLTIPVAFGV 180

Qy 181 YVNYRWPKQSKIILKIGAVVGGVLLLVVAVAGVVLAKGSWNSDITLLTISFIFPLIGHVT 240
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 181 YVNYRWPKQSKIILKIGAVVGGVLLLVVAVAGVVLAKGSWNSDITLLTISFIFPLIGHVT 240

Qy 241 GFLLALFTHQSWQRCRTISLETGAQNIQMCITMLQLSFTAEHLVQMLSFPLAYGLFQLID 300
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 241 GFLLALFTHQSWQRCRTISLETGAQNIQMCITMLQLSFTAEHLVQMLSFPLAYGLFQLID 300

Qy 301 GFLIVAAYQTYKRRLKNKHGKKNSGCTEVCHTRKSTSSRETNAFLEVNEEGAITPGPPGP 360
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 301 GFLIVAAYQTYKRRLKNKHGKKNSGCTEVCHTRKSTSSRETNAFLEVNEEGAITPGPPGP 360

Qy 361 MDCHRALEPVGHITSCE 377
       |||||||||||||||||
Db 361 MDCHRALEPVGHITSCE 377



RESULT 2

US-09-981-151A-40

; Sequence 40, Application US/09981151A
; Publication No. US20030212256A1
GENERAL INFORMATION: 

APPLICANT: Edinger, Shlomit R 

APPLICANT: Gerlach, Valerie 

APPLICANT: MacDougall, John R 

APPLICANT: Malyankar, Muriel M 

APPLICANT: Smithson, Glennda 

APPLICANT: Millet, Isabelle 

APPLICANT: Peyman, John A 

APPLICANT: Stone, David J 

APPLICANT: Gunther, Erik 

APPLICANT: Ellerman, Karen 

APPLICANT: Shimkets, Richard A 

APPLICANT: Padigaru, Muralidhara 

APPLICANT: Guo, Xiaojia 

APPLICANT: Patturajan, Meera 

APPLICANT: Taupier Jr, Raymond J 

APPLICANT: Burgess, Catherine E 

APPLICANT: Zerhusen, Bryan D 

APPLICANT: Kekuda, Ramesh 

APPLICANT: Spytek, Kimberly A 

APPLICANT: Gangolli, Esha A 

APPLICANT: Fernandes, Elma R 

APPLICANT: Gorman, Linda 

TITLE OF INVENTION: Proteins and Nucleic Acids Encoding Same 
FILE REFERENCE: 21402-168 

CURRENT APPLICATION NUMBER: US/09/981,151A
CURRENT FILING DATE: 2001-10-16 
PRIOR APPLICATION NUMBER: 60/241,040 
PRIOR FILING DATE: 2000-10-17 
PRIOR APPLICATION NUMBER: 60/241,058 
PRIOR FILING DATE: 2000-10-17 
PRIOR APPLICATION NUMBER: 60/241,063 
PRIOR FILING DATE: 2000-10-17 
PRIOR APPLICATION NUMBER: 60/241,243 
PRIOR FILING DATE: 2000-10-17 
PRIOR APPLICATION NUMBER: 60/242,152 
PRIOR FILING DATE: 2000-10-20 
PRIOR APPLICATION NUMBER: 60/242,482 
PRIOR FILING DATE: 2000-10-23 
PRIOR APPLICATION NUMBER: 60/242,611 
PRIOR FILING DATE: 2000-10-23 
PRIOR APPLICATION NUMBER: 60/242,612 
PRIOR FILING DATE: 2000-10-23 
PRIOR APPLICATION NUMBER: 60/242,880 
PRIOR FILING DATE: 2000-10-24 
PRIOR APPLICATION NUMBER: 60/242,881 
PRIOR FILING DATE: 2000-10-24 
Remaining Prior Application data removed - See File Wrapper or PALM.
NUMBER OF SEQ ID NOS: 160
SOFTWARE: PatentIn Ver. 2.1
SEQ ID NO 40
LENGTH: 373
TYPE: PRT
ORGANISM: Mus musculus
US-09-981-151A-40

Query Match 71.5%; Score 1415; DB 11; Length 373;
Best Local Similarity 70.3%; Pred. No. 5e-127;
Matches 265; Conservative 50; Mismatches 58; Indels 4; Gaps 2;



Qy 1 MRANCSSSSACPANSSEEELPVGLEVHGNLELVFTWSTVMMGLLMFSLGCSVEIRKLWS 60
1 : 1 : : 1 II 1 1 : t 1 : 1 1 1 : 1 1 1 1 : 1 : 1 1 1 : 1 1 1 : 1 1 : 1 1 1 1 1 1 1 1 : 1 1 1
Db 1 MSTDCAGNSTCPVNSTEEDPPVGMEGHANLKLLFTVLSAVMVGLVMFSFGCSVESQKLWL 60

Qy 61 HIRRPWGIAVGLLCQFGLMPFTAYLLAISFSLKPVQAIAVLIMGCCPGGTISNIFTFWVD 120
1:11111111111 1 1 1 1 1 1 1 1 1 1 1 II 1 III 111111:11 1 1 1 1 1 1 1 1 : Mill
Db 61 HLRRPWGIAVGLLSQFGLMPLTAYLLAIGFGLKPFQAIAVLMMGSCPGGTISNVLTFWVD 120

Qy 121 GDMDLSISMTTCSTVAALGMMPLCIYLYTWSWSLQQNLTIPYQNIGITLVCLTIPVAFGV 180
I 1 I 1 1 1 1 1 II 1 1 1 1 1 1 1 1 ! 1 1 1 1 1 : 1 : 1 1 1 1 : 1 III 1111:111111 1 : 1 1 1 II
Db 121 GDMDLSISMTTCSTVAALGMMPLCLYIYTRSWTLTQNLVIPYQSIGITLVS LWPVASGV 180

Qy 181 YVN Y RW PKQSKIILKI GAVVGGVLL L VVAVAG WLAKG S WN SDITLLTISFIFPLI GHVT 240
1 I I I I I || I : : 1 II : II : : 1 1 : 1 1 1 1 1 1 1 Mill! 1 1 : 1 : 1 1 1 II MINIMI
Db 181 YVN YRW P KQAT VI L KVGAI LGGML L LWAVT GMVLAKG- WN T DVT L LVI SCI FPLVGHVT 239

Qy 241 G FL LAL FT HQ S WQ RC RT I S L ET GAQN I QMC I TMLQ L S FT AEH LVQML S F P LAY GL FQ L I D 300
I I I I i 1 1 M 1 1 1 1 1 1 1 1 • 1 1 1 1 1 1 1 1 • 1 1 1 1 1 1 1 1 • II • 1 1 1 • t • 1 1 1 1 1 1 1 1 : :
Db 240 GFLLAFLTHQSWQRCRTISIETGAQNIQLCIAMLQLSFSAEYLVQLLNFALAYGLFQVLH 299

Qy 301 GFLIVAAYQTYKRRLKNKHGKKNSGCTEVCHTRKSTSSRETNAFLEVNEEGAITPGPPGP 360
I I I I I I II 1 1 1 1 1 : 1 ::: | :||: :: MMMM M Ml II 1
Db 300 GLLIVAAYQAYKRRQKSKCRRQHPDCPDVCYEKQ PRETSAFLDKGDEAAVTLGPVQP 356

Qy 361 MDCHRALEPVGHITSCE 377
1 1 1 1 Mill
Db 357 EQHHRAAELTSHIPSCE 373



RESULT 3 

US-09-981-151A-12 

Sequence 12, Application US/09981151A 
Publication No. US20030212256A1 
GENERAL INFORMATION: 



APPLICANT: Edinger, Shlomit R
APPLICANT: Gerlach, Valerie
APPLICANT: MacDougall, John R
APPLICANT: Malyankar, Muriel M
APPLICANT: Smithson, Glennda
APPLICANT: Millet, Isabelle
APPLICANT: Peyman, John A
APPLICANT: Stone, David J
APPLICANT: Gunther, Erik
APPLICANT: Ellerman, Karen
APPLICANT: Shimkets, Richard A
APPLICANT: Padigaru, Muralidhara
APPLICANT: Guo, Xiaojia
APPLICANT: Patturajan, Meera
APPLICANT: Taupier Jr, Raymond J
APPLICANT: Burgess, Catherine E
APPLICANT: Zerhusen, Bryan D
APPLICANT: Kekuda, Ramesh
APPLICANT: Spytek, Kimberly A
APPLICANT: Gangolli, Esha A
APPLICANT: Fernandes, Elma R
APPLICANT: Gorman, Linda

TITLE OF INVENTION: Proteins and Nucleic Acids Encoding Same 
FILE REFERENCE: 21402-168 

CURRENT APPLICATION NUMBER: US/09/981,151A
CURRENT FILING DATE: 2001-10-16 
PRIOR APPLICATION NUMBER: 60/241,040 
PRIOR FILING DATE: 2000-10-17 
PRIOR APPLICATION NUMBER: 60/241,058 
PRIOR FILING DATE: 2000-10-17 
PRIOR APPLICATION NUMBER: 60/241,063 
PRIOR FILING DATE: 2000-10-17 
PRIOR APPLICATION NUMBER: 60/241,243 
PRIOR FILING DATE: 2000-10-17 
PRIOR APPLICATION NUMBER: 60/242,152 
PRIOR FILING DATE: 2000-10-20 
PRIOR APPLICATION NUMBER: 60/242,482 
PRIOR FILING DATE: 2000-10-23 
PRIOR APPLICATION NUMBER: 60/242,611 
PRIOR FILING DATE: 2000-10-23 
PRIOR APPLICATION NUMBER: 60/242,612 
PRIOR FILING DATE: 2000-10-23 
PRIOR APPLICATION NUMBER: 60/242,880 
PRIOR FILING DATE: 2000-10-24 
PRIOR APPLICATION NUMBER: 60/242,881 
PRIOR FILING DATE: 2000-10-24 

Remaining Prior Application data removed - See File Wrapper or PALM. 
NUMBER OF SEQ ID NOS: 160
SOFTWARE: PatentIn Ver. 2.1
SEQ ID NO 12 
LENGTH: 326 
TYPE: PRT 

ORGANISM: Homo sapiens 
US-09-981-151A-12 

Query Match 60.7%; Score 1200.5; DB 11; Length 326; 

Best Local Similarity 80.6%; Pred. No. 1.5e-106; 

Matches 250; Conservative 11; Mismatches 30; Indels 19; Gaps 6; 

Qy 1 MRAN C S S S SAC PAN S S E E E L P VG L E VH GN L E L VFT WS T VMMG L LMF S L GC S VE I RK LW S 60
I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I : I I I I I I I I I M I I I I I I I M
Db 1 MRANCSSSSACPANSSEEELPVGLEVHGNLELVFTWSTIMMGLLMFSLGCSVEIRKLWS 60

Qy 61 HI RRPWGI AVGLLCQFGLMP FTAYLLAI S FS LKPVQAI AVL IMGCCPGGT I SNI FTFWVD 120
I I I I ! I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I M II I I I I : II
Db 61 HI RRPWGI AVGLLCQFGLMP FTAYLLAI SFS LKPVQAI AVLIMGCCRG APSLTFSPS 117

Qy 121 GDMDLSISMTTCSTVAALGMMPLCIYLYTWSWSLQQNLTIPYQNI GITLVCLTIPV 176
I | :: || | | | | | I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I
Db 118

Qy 177 236
I I I I I I I I I II I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I
Db 171

Qy 237 GHVTGFLLALFTHQSWQRCRTISLETGAQNIQMCITMLQLSFTAEHLVQMLSF-PLAYGL 295



I I I I I I I I I I I I I I I I I II:: I I I : I : : I :: I I I I I I 

Db 229 GHVTGFLLALFTHQSWQ — RTLPIFLGLAFKTPCDTLLAMTSCPECSRLIYAFIPLLYGL 286 

Qy 296 FQLIDGFLIV 305 

I I I I I I I I I I 
Db 287 FQLIDGFLIV 296 



RESULT 4 

US-09-981-151A-41 

; Sequence 41, Application US/09981151A 

; Publication No. US20030212256A1 

; GENERAL INFORMATION: 

; APPLICANT: Edinger, Shlomit R
; APPLICANT: Gerlach, Valerie
; APPLICANT: MacDougall, John R
; APPLICANT: Malyankar, Muriel M
; APPLICANT: Smithson, Glennda
; APPLICANT: Millet, Isabelle
; APPLICANT: Peyman, John A
; APPLICANT: Stone, David J
; APPLICANT: Gunther, Erik
; APPLICANT: Ellerman, Karen
; APPLICANT: Shimkets, Richard A
; APPLICANT: Padigaru, Muralidhara
; APPLICANT: Guo, Xiaojia
; APPLICANT: Patturajan, Meera
; APPLICANT: Taupier Jr, Raymond J
; APPLICANT: Burgess, Catherine E
; APPLICANT: Zerhusen, Bryan D
; APPLICANT: Kekuda, Ramesh
; APPLICANT: Spytek, Kimberly A
; APPLICANT: Gangolli, Esha A
; APPLICANT: Fernandes, Elma R
; APPLICANT: Gorman, Linda
; TITLE OF INVENTION: Proteins and Nucleic Acids Encoding Same
; FILE REFERENCE: 21402-168
; CURRENT APPLICATION NUMBER: US/09/981,151A
; CURRENT FILING DATE: 2001-10-16
; PRIOR APPLICATION NUMBER: 60/241,040
; PRIOR FILING DATE: 2000-10-17
; PRIOR APPLICATION NUMBER: 60/241,058
; PRIOR FILING DATE: 2000-10-17
; PRIOR APPLICATION NUMBER: 60/241,063
; PRIOR FILING DATE: 2000-10-17
; PRIOR APPLICATION NUMBER: 60/241,243
; PRIOR FILING DATE: 2000-10-17
; PRIOR APPLICATION NUMBER: 60/242,152
; PRIOR FILING DATE: 2000-10-20
; PRIOR APPLICATION NUMBER: 60/242,482
; PRIOR FILING DATE: 2000-10-23
; PRIOR APPLICATION NUMBER: 60/242,611
; PRIOR FILING DATE: 2000-10-23
; PRIOR APPLICATION NUMBER: 60/242,612
; PRIOR FILING DATE: 2000-10-23
; PRIOR APPLICATION NUMBER: 60/242,880
; PRIOR FILING DATE: 2000-10-24
; PRIOR APPLICATION NUMBER: 60/242,881
; PRIOR FILING DATE: 2000-10-24

Remaining Prior Application data removed - See File Wrapper or PALM. 
NUMBER OF SEQ ID NOS: 160
SOFTWARE: PatentIn Ver. 2.1
SEQ ID NO 41
LENGTH: 347
TYPE: PRT

ORGANISM: Oryctolagus cuniculus
US-09-981-151A-41 

Query Match 44.8%; Score 886; DB 11; Length 347; 

Best Local Similarity 47.0%; Pred. No. 2.4e-76; 

Matches 164; Conservative 73; Mismatches 98; Indels 14; Gaps 5; 

Qy 11 CPANSS — EEELPVGLEVHGN — LELVFTWSTVMMGLLMFSLGCSVEIRKLWSHIRRPW 66 

III:: I I I : I I : I : I I : : : I : I I I : I I : I I I : I I I I I I I 
Db 8 C LANAT VCE GAS C VAP E SN FNAI L SWL S T VLT I LLALVMFSMGCN VE I KKFL GH I RRP W 67 

Qy 67 GIAVGLLCQFGLMPFTAYLLAISFSLKPVQAIAVLIMGCCPGGTISNIFTFWVDGDMDLS 126 

I | : | I I I I I : I I I : : I I : : I : I : I I : I I I I I I I I I I I I I I : I I I I I I I I I 

Db 68 GIFIGFLCQFGIMPLTGFVLAVAFGIMPIQAVWLIMGCCPGGTASNILAYWVDGDMDLS 127 

Qy 127 ISMTTCSTVAALGMMPLCIYLYTWSWSLQQNLTIPYQNIGITLVCLTIPVAFGVYVNYRW 186

: I I I I I I I : I II I I I I I : I : II I : Ml III : I I I : I I : I : : I I : : I 

Db 128 VSMTTCSTLIAiGMMPLCLYWTKMWDSGTIVIPYDNIGTSLVALWPVSIGMFVNHKW 187 

Qy 187 PKQSKIILKI G AWG G VL L LWAVAGWLAK G S WN SDITLLTISFIFPLIG H VT G F L LAL 246 

I : : : I I I II : I : : I I I : : : : I I I : I : : I : I I I I I : I : II I I 

Db 188 PQKAKIILKVGSIAGAVLIVLIAWGGILYQSAWIIEPKLWIIGTIFPMAGYSLGFFLAR 247 

Qy 247 FTHQSWQRCRTISLETGAQNIQMCITMLQLSFTAEHLVQMLSFPLAYGLFQLIDGFLIVA 306 

I I II I I :: I I I I I I I : I I :: I I I I : I I : : I I I I : I I : : : 
Db 248 IAGQPWYRCRTVALETGMQNTQLCSTIVQLSFSPEDLTYVFTFPLIYSIFQIAFAAI FLG 307 

Qy 307 AYQTYKRRLKNKHGKKNSGCTEVCHTRKSTSSRETNAFLEVNEEGAITP 355 

II:: III:: : : I I : : : I : : I I I 

Db 308 IYVAYRK CHGKNDAEFPDI KDTKTEPESSFHQMN — GGFQP 346 



RESULT 5 

US-09-981-151A-45 

Sequence 45, Application US/09981151A 
Publication No. US20030212256A1 
GENERAL INFORMATION: 
APPLICANT: Edinger, Shlomit R 
APPLICANT: Gerlach, Valerie 
APPLICANT: MacDougall, John R 
APPLICANT: Malyankar, Muriel M 
APPLICANT: Smithson, Glennda 
APPLICANT: Millet, Isabelle 
APPLICANT: Peyman, John A 
APPLICANT: Stone, David J 
APPLICANT: Gunther, Erik 
APPLICANT: Ellerman, Karen 
APPLICANT: Shimkets, Richard A 
APPLICANT: Padigaru, Muralidhara 



APPLICANT: Guo, Xiaojia 
APPLICANT: Patturajan, Meera 
APPLICANT: Taupier Jr, Raymond J 
APPLICANT: Burgess, Catherine E 
APPLICANT: Zerhusen, Bryan D
APPLICANT: Kekuda, Ramesh 
APPLICANT: Spytek, Kimberly A 
APPLICANT: Gangolli, Esha A 
APPLICANT: Fernandes, Elma R 
APPLICANT: Gorman, Linda 

TITLE OF INVENTION: Proteins and Nucleic Acids Encoding Same 
FILE REFERENCE: 21402-168 

CURRENT APPLICATION NUMBER: US/09/981,151A
CURRENT FILING DATE: 2001-10-16 
PRIOR APPLICATION NUMBER: 60/241,040 
PRIOR FILING DATE: 2000-10-17 
PRIOR APPLICATION NUMBER: 60/241,058 
PRIOR FILING DATE: 2000-10-17 
PRIOR APPLICATION NUMBER: 60/241,063 
PRIOR FILING DATE: 2000-10-17 
PRIOR APPLICATION NUMBER: 60/241,243 
PRIOR FILING DATE: 2000-10-17 
PRIOR APPLICATION NUMBER: 60/242,152 
PRIOR FILING DATE: 2000-10-20 
PRIOR APPLICATION NUMBER: 60/242,482 
PRIOR FILING DATE: 2000-10-23 
PRIOR APPLICATION NUMBER: 60/242,611 
PRIOR FILING DATE: 2000-10-23 
PRIOR APPLICATION NUMBER: 60/242,612 
PRIOR FILING DATE: 2000-10-23 
PRIOR APPLICATION NUMBER: 60/242,880 
PRIOR FILING DATE: 2000-10-24 
PRIOR APPLICATION NUMBER: 60/242,881 
PRIOR FILING DATE: 2000-10-24 

Remaining Prior Application data removed - See File Wrapper or PALM. 
NUMBER OF SEQ ID NOS: 160
SOFTWARE: PatentIn Ver. 2.1
SEQ ID NO 45
LENGTH: 348
TYPE: PRT 

ORGANISM: Homo sapiens 
US-09-981-151A-45 

Query Match 44.7%; Score 884; DB 11; Length 348; 

Best Local Similarity 46.9%; Pred. No. 3.7e-76; 

Matches 164; Conservative 74; Mismatches 102; Indels 10; Gaps 4; 

Qy 7 S S SAC PAN SS — EEEL P VGLEVHGN — LELVFTWSTVMMGLLMFSLGCSVEIRKLWSHI 62 

: I I I I : : I : : I : I I : I : I I : : : I : I I I : I I : I I : I I : 
Db 3 NSSICNPNATICEGDSCIAPESNFNAILSWMSTVLTILIALVMFSMGCNVELHKFLGHL 62 

Qy 63 RRPWGIAVGLLCQFGLMPFTAYLLAISFSLKPVQAIAVLIMGCCPGGTISNIFTFWVDGD 122 

I I I I I I I I I I I I I : I I I :: I ::: I : I I I I : I I I I I I I I I I I I I : I I I I I 
Db 63 RRPWGIWGFLCQFGIMPLTGFVLSVAFGILPVQAVWLIQGCCPGGTASNILAYWVDGD 122 



Qy 



123 MDLSISMTTCSTVAALGMMPLCIYLYTWSWSLQQNLTIPYQNIGITLVCLTIPVAFGVYV 182 
I I I I : I I I I I I I : I I I I I I I I ::: I I I : I I I : I I : I I I I I I : I : I I 



Db 



123 MDLSVSMTTCSTLLALGMMPLCLFIYTKMWVDSGTIVIPYDSIGTSLVALVIPVSIGMYV 182 



Qy 183 NYRWPKQSKIILKI GAWGGVLL L WAVAG WLAKG S WN SDITLLTISFIFPLI GH VT G F 242 

I : : I I : : : I I I I I I I : : I : I : : : : I I I : I : : I : I I I : I : h I I 

Db 183 NHKWPQKAKIILKIGSIAGAILIVLIAWGGILYQSAWTIEPKLWIIGTIYPIAGYGLGF 242 

Qy 243 LLALFTHQSWQRCRTISLETGAQNIQMCITMLQLSFTAEHLVQMLSFPLAYGLFQLIDGF 302 

II I I I I II :: I I I I I I I : I I :: I II I : I I : : I I I I : I I : 

Db 243 FLARIAGQPWYRCRTVALETGLQNTQLCSTIVQLSFSPEDLNLVFTFPLIYSIFQIAFAA 302 

Qy 303 LIVAAYQTYKRRLKNKHGKKNSGCTEVCHTRKS— TSSRETNAFLEVNEE 350 

: : : M I I : I I I I : I : I : I I I : : I : 

Db 303 ILL GAYVAY KK CHGKNNTELQEKTDNEMEPRSSFQETNKGFQPDEK 348 



RESULT 6 

US-09-981-151A-42 

Sequence 42, Application US/09981151A 
Publication No. US20030212256A1 
GENERAL INFORMATION: 
APPLICANT: Edinger, Shlomit R
APPLICANT: Gerlach, Valerie
APPLICANT: MacDougall, John R
APPLICANT: Malyankar, Muriel M
APPLICANT: Smithson, Glennda
APPLICANT: Millet, Isabelle
APPLICANT: Peyman, John A
APPLICANT: Stone, David J
APPLICANT: Gunther, Erik
APPLICANT: Ellerman, Karen
APPLICANT: Shimkets, Richard A
APPLICANT: Padigaru, Muralidhara
APPLICANT: Guo, Xiaojia
APPLICANT: Patturajan, Meera
APPLICANT: Taupier Jr, Raymond J
APPLICANT: Burgess, Catherine E
APPLICANT: Zerhusen, Bryan D
APPLICANT: Kekuda, Ramesh
APPLICANT: Spytek, Kimberly A
APPLICANT: Gangolli, Esha A
APPLICANT: Fernandes, Elma R
APPLICANT: Gorman, Linda

TITLE OF INVENTION: Proteins and Nucleic Acids Encoding Same
FILE REFERENCE: 21402-168

CURRENT APPLICATION NUMBER: US/09/981,151A
CURRENT FILING DATE: 2001-10-16
PRIOR APPLICATION NUMBER: 60/241,040
PRIOR FILING DATE: 2000-10-17
PRIOR APPLICATION NUMBER: 60/241,058
PRIOR FILING DATE: 2000-10-17
PRIOR APPLICATION NUMBER: 60/241,063
PRIOR FILING DATE: 2000-10-17
PRIOR APPLICATION NUMBER: 60/241,243
PRIOR FILING DATE: 2000-10-17
PRIOR APPLICATION NUMBER: 60/242,152
PRIOR FILING DATE: 2000-10-20
PRIOR APPLICATION NUMBER: 60/242,482
PRIOR FILING DATE: 2000-10-23
PRIOR APPLICATION NUMBER: 60/242,611 
PRIOR FILING DATE: 2000-10-23 
PRIOR APPLICATION NUMBER: 60/242,612 
PRIOR FILING DATE: 2000-10-23 
PRIOR APPLICATION NUMBER: 60/242,880 
PRIOR FILING DATE: 2000-10-24 
PRIOR APPLICATION NUMBER: 60/242,881 
PRIOR FILING DATE: 2000-10-24 

Remaining Prior Application data removed - See File Wrapper or PALM. 
NUMBER OF SEQ ID NOS: 160
SOFTWARE: PatentIn Ver. 2.1
SEQ ID NO 42 
LENGTH: 348 
TYPE: PRT 

ORGANISM: Rattus norvegicus 
US-09-981-151A-42 

Query Match 44.0%; Score 871; DB 11; Length 348; 

Best Local Similarity 47.1%; Pred. No. 6.6e-75; 

Matches 165; Conservative 70; Mismatches 105; Indels 10; Gaps 3; 

Qy 7 SSSACPANSSEEELPVGLEVHGN L E LVFTWS T VMMGL LMF S L GC S VE I RKLW SHI 62 

: I I I | : : I I I I I : I I : : : : : I I I : I I : I I I I II 

Db 3 NSSVCSPNATFCEGDSCLVTESNFNAILSTVMSTVLTILLAMVMFSMGCNVEINKFLGHI 62 

Qy 63 RRPWGIAVGLLCQFGLMPFTAYLLAISFSLKPVQAIAVLIMGCCPGGTISNIFTFWVDGD 122 

: | | | | | || 11111:11 I : : I : : : : MM: I I I I I I I I I I I III : I : I I I 

Db 63 KRPWGIFVGFLCQFGIMPLTGFILSVASGILPVQAVWLIMGCCPGGTGSNILAYWIDGD 122 

Qy 123 MDLSI SMTTCSTVAALGMMPLCI YLYTWSWSLQQNLTI P YQNI GITLVCLTI PVAFGVYV 182 

I I I I : I II I I I I : I I I I I I I I : : : I I I : I I I : I I I : I I I I I I : I : : I 

Db 123 MDLSVSMTTCSTLLALGMMPLCLFIYTKMWVDSGTIVIPYDSIGISLVALVIPVSIGMFV 182 

Qy 183 NYRWPKQSKIILKIGAWGGVLLLWAVAGWLAKGSWNSDITLLTISFIFPLIGHVTGF 242 

I : : I I : : : I I I I I I I : : I : I : : : : I I I : I : : I : I I I I I : I : II 

Db 183 NHKWPQKAKIILKIGSIAGAILIVLIAWGGILYQSAWIIEPKLWIIGTIFPIAGYSLGF 242 

Qy 243 LLALFTHQSWQRCRTISLETGAQNIQMCITMLQLSFTAEHLVQMLSFPLAYGLFQLIDGF 302 

II I I I I I I :: I I I I I I I : I I I I I I : I I : : I I I I : I I I : 

Db 24 3 FLARLAGQ P W YRC RT VAL ET GMQNT Q L C S T I VQ L S F S P E DLN L VFT F P L I YT VFQ L VFAA 302 

Qy 303 LI VAAYQT YKRRLKNKHGKKN S GCTEVCHT RKS — TSSRETNAFLEVNEE 350 

: | : | I I I : III:: I 1=111 : : I : 

Db 303 IILGMYVTYKK CHGKNDAEFLEKTDNDMDPMPSFQETNKGFQPDEK 348 



RESULT 7 

US-09-981-151A-43 

Sequence 43, Application US/09981151A 
Publication No. US20030212256A1 
GENERAL INFORMATION: 
APPLICANT: Edinger, Shlomit R 
APPLICANT: Gerlach, Valerie 
APPLICANT: MacDougall, John R 
APPLICANT: Malyankar, Muriel M 
APPLICANT: Smithson, Glennda 



APPLICANT: Millet, Isabelle 
APPLICANT: Peyman, John A 
APPLICANT: Stone, David J 
APPLICANT: Gunther, Erik 
APPLICANT: Ellerman, Karen 
APPLICANT: Shimkets, Richard A 
APPLICANT: Padigaru, Muralidhara 
APPLICANT: Guo, Xiaojia 
APPLICANT: Patturajan, Meera 
APPLICANT: Taupier Jr, Raymond J 
APPLICANT: Burgess, Catherine E 
APPLICANT: Zerhusen, Bryan D 
APPLICANT: Kekuda, Ramesh 
APPLICANT: Spytek, Kimberly A 
APPLICANT: Gangolli, Esha A 
APPLICANT: Fernandes, Elma R 
APPLICANT: Gorman, Linda 

TITLE OF INVENTION: Proteins and Nucleic Acids Encoding Same 
FILE REFERENCE: 21402-168 

CURRENT APPLICATION NUMBER: US/09/981,151A
CURRENT FILING DATE: 2001-10-16 
PRIOR APPLICATION NUMBER: 60/241,040 
PRIOR FILING DATE: 2000-10-17 
PRIOR APPLICATION NUMBER: 60/241,058 
PRIOR FILING DATE: 2000-10-17 
PRIOR APPLICATION NUMBER: 60/241,063 
PRIOR FILING DATE: 2000-10-17 
PRIOR APPLICATION NUMBER: 60/241,243 
PRIOR FILING DATE: 2000-10-17 
PRIOR APPLICATION NUMBER: 60/242,152 
PRIOR FILING DATE: 2000-10-20 
PRIOR APPLICATION NUMBER: 60/242,482 
PRIOR FILING DATE: 2000-10-23 
PRIOR APPLICATION NUMBER: 60/242,611 
PRIOR FILING DATE: 2000-10-23 
PRIOR APPLICATION NUMBER: 60/242,612 
PRIOR FILING DATE: 2000-10-23 
PRIOR APPLICATION NUMBER: 60/242,880 
PRIOR FILING DATE: 2000-10-24 
PRIOR APPLICATION NUMBER: 60/242,881 
PRIOR FILING DATE: 2000-10-24 

Remaining Prior Application data removed - See File Wrapper or PALM. 
NUMBER OF SEQ ID NOS: 160
SOFTWARE: PatentIn Ver. 2.1
SEQ ID NO 43
LENGTH: 348
TYPE: PRT

ORGANISM: Mus musculus
US-09-981-151A-43 



Query Match 44.0%; Score 871; DB 11; Length 348; 

Best Local Similarity 47.4%; Pred. No. 6.6e-75; 

Matches 167; Conservative 74; Mismatches 97; Indels 14; Gaps 5; 

Qy 7 S SSACPANS S - - EEELPVGLEVHGN- - LELVFTVVSTVMMGLLMFS LGCSVEI RKLWS HI 62 

: I I I I I : : I : I I : I I I : I I : : : : : I I I : I I : I I : I II 
Db 3 N S S VCP PNATVCEGDS CWP ESNFNAI LNTVMSTVLT I LLAMVMFSMGCNVEVHKFLGH I 62 



Qy 63 RRPWGIAVGLLCQFGLMPFTAYLLAISFSLKPVQAIAVLIMGCCPGGTISNIFT FWVDGD 122 

: I I i I I I I I | | | | : I I I : : I : : : : I I I I : I I I I I I I I I I I I I I : I : I I I 

Db 63 KRPWGIFVGFLCQFGIMPLTGFILSVASGILPVQAVVVXIMGCCPGGTGSNILAYWIDGD 122 

Qy 123 MDLSISMTTCSTVAALGMMPLCIYLYTWSWSLQQNLTIPYQNIGITLVCLTIPVAFGVYV 182 

I I I I : I I I I I I I : I I I I I I I I ::: I I I : I I I : I I I : I I I I I I : I I : : I 

Db 123 MDLSVSMTTCSTLLALGl^PLCLFVYTKMWVDSGTIVIPYDSIGISLVALVIPVSFGMFV 182 

Qy 183 NYRWPKQSKIILKIGAWGGVLLLWAVAGWIAKGSWNSDITLLTISFIFPLIGHVTGF 242 

I :: I I :: : I I I I I I I :: I : I :: :: I I I : I : : I : I I I I I : I : II 
Db 183 NHKWPQKAKIILKIGSITGVILIVLIAVIGGILYQSAWIIEPKLWIIGTIFPIAGYSLGF 242 

Qy 243 LLALFTHQSWQRCRTISLETGAQNIQMCITMLQLSFTAEHLVQMLSFPLAYGLFQLIDGF 302 

II I I II I I : : I I II II I : I I : : I I I I : I I : : I I I I : I I I : 

Db 243 FLARLAGQPWYRCRTVALETGMQNTQLCSTIVQLSFSPEDLNLVFTFPLIYTVFQLVFAA 302 

Qy 303 LIVAAYQTYKRRLKNKHGKKNSGCTEVCHTRKSTSSR ETNAFLEVNEE 350 

: I : I I I : : : I I : : I I II III : : I : 

Db 303 VILGIYVTYRK CYGKNDAEFLE — KTDNEMDSRPSFDETNKGFQPDEK 348 



RESULT 8 

US-09-981-151A-44 

; Sequence 44, Application US/09981151A 
; Publication No. US20030212256A1 
; GENERAL INFORMATION: 



APPLICANT: Edinger, Shlomit R
APPLICANT: Gerlach, Valerie
APPLICANT: MacDougall, John R
APPLICANT: Malyankar, Muriel M
APPLICANT: Smithson, Glennda
APPLICANT: Millet, Isabelle
APPLICANT: Peyman, John A
APPLICANT: Stone, David J
APPLICANT: Gunther, Erik
APPLICANT: Ellerman, Karen
APPLICANT: Shimkets, Richard A
APPLICANT: Padigaru, Muralidhara
APPLICANT: Guo, Xiaojia
APPLICANT: Patturajan, Meera
APPLICANT: Taupier Jr, Raymond J
APPLICANT: Burgess, Catherine E
APPLICANT: Zerhusen, Bryan D
APPLICANT: Kekuda, Ramesh
APPLICANT: Spytek, Kimberly A
APPLICANT: Gangolli, Esha A
APPLICANT: Fernandes, Elma R
APPLICANT: Gorman, Linda



TITLE OF INVENTION: Proteins and Nucleic Acids Encoding Same 
FILE REFERENCE: 21402-168 

; CURRENT APPLICATION NUMBER: US/09/981,151A

CURRENT FILING DATE: 2001-10-16 

PRIOR APPLICATION NUMBER: 60/241,040 

PRIOR FILING DATE: 2000-10-17 

PRIOR APPLICATION NUMBER: 60/241,058 

PRIOR FILING DATE: 2000-10-17 



; PRIOR APPLICATION NUMBER: 60/241,063 

; PRIOR FILING DATE: 2000-10-17 

; PRIOR APPLICATION NUMBER: 60/241,243 

; PRIOR FILING DATE: 2000-10-17 

; PRIOR APPLICATION NUMBER: 60/242,152 

; PRIOR FILING DATE: 2000-10-20 

; PRIOR APPLICATION NUMBER: 60/242,482 

; PRIOR FILING DATE: 2000-10-23 

; PRIOR APPLICATION NUMBER: 60/242,611 

; PRIOR FILING DATE: 2000-10-23 

; PRIOR APPLICATION NUMBER: 60/242,612 

; PRIOR FILING DATE: 2000-10-23 

; PRIOR APPLICATION NUMBER: 60/242,880 

; PRIOR FILING DATE: 2000-10-24 

; PRIOR APPLICATION NUMBER: 60/242,881 

; PRIOR FILING DATE: 2000-10-24 

Remaining Prior Application data removed - See File Wrapper or PALM. 

; NUMBER OF SEQ ID NOS : 160 

; SOFTWARE: PatentIn Ver. 2.1

; SEQ ID NO 44 

; LENGTH: 348

; TYPE: PRT 

; ORGANISM: Mus musculus
US-09-981-151A-44 

Query Match 44.0%; Score 871; DB 11; Length 348; 

Best Local Similarity 47.4%; Pred. No. 6.6e-75;

Matches 167; Conservative 74; Mismatches 97; Indels 14; Gaps 5; 

Qy 7 SSSACPANSS--EEELPVGLEVHGN--LELVFTVVSTVMMGLLMFSLGCSVEIRKLWSHI 62

: I I M I : : I : I I : I I I : I I : : : : : I I I : I I : I I : I I I 

Db 3 NSSVCPPNATVCEGDSCWPESNFNAILNTVMSTVLTILLAMVMFSMGCNVEVHKFLGHI 62 

Qy 63 RRPWGIAVGLLCQFGLMPFTAYLLAISFSLKPVQAIAVLIMGCCPGGTISNIFTFWVDGD 122

: | | | | | | | | I I I I : I I I : : I : : : : I I I I : I I I I I I I I I I I I I I : I : I I I 

Db 63 KRPWGIFVGFLCQFGIMPLTGFILSVASGILPVQAVWLIMGCCPGGTGSNILAYWIDGD 122 

Qy 123 MDLSISMTTCSTVAALGMMPLCIYLYTWSWSLQQNLTIPYQNIGITLVCLTIPVAFGVYV 182

I I I I : I I I I I I I : I I I I I I I I ::: I I I : I I I : I I I : I I I I I I : I I : : I 

Db 123 MDLSVSMTTCSTLLALGMMPLCLFVYTKMWVDSGTIVIPYDSIGISLVALVIPVSFGMFV 182 

Qy 183 NYRWPKQSKIILKIGAWGGVLLLWAVAGWLAKGSWNSDITLLTISFIFPLIGHVTGF 242 

| : : I I : : : I I I I I I I : : I : I : : : : I I I : I : : I : I I I I I : I : II 
Db 183 NHKWPQKAKIILKIGSITGVILIVLIAVIGGILYQSAWIIEPKLWIIGTIFPIAGYSLGF 242

Qy 243 LLALFTHQSWQRCRTISLETGAQNIQMCITMLQLSFTAEHLVQMLSFPLAYGLFQLIDGF 302 

II I I I I t I:: I I I I I I I: I I ::l t I I: I I : =111 I : I I I • 

Db 243 FLARLAGQPWYRCRTVALETGMQNTQLCSTIVQLSFSPEDLNLVFTFPLIYTVFQLVFAA 302 

Qy 303 LIVAAYQTYKRRLKNKHGKKNSGCTEVCHTRKSTSSR ETNAFLEVNEE 350 

: I : I I I : : : I I : : I I II III : : I : 

Db 303 VILGIYVTYRK CYGKNDAEFLE — KTDNEMDSRPSFDETNKGFQPDEK 348 



RESULT 9 

US-10-288-222A-16 

; Sequence 16, Application US/10288222A 



; Publication No. US20030119742A1 
; GENERAL INFORMATION: 
; APPLICANT: Logan, Thomas Joseph 
; APPLICANT: Galvin, Katherine 
; APPLICANT: Chun, Mi young 

; TITLE OF INVENTION: Methods and Compositions to treat
; TITLE OF INVENTION: Cardiovascular Disease Using 139, 258, 1261, 1486, 2398, 2414, 7660, 8587,
; TITLE OF INVENTION: 10183, 10550, 12680, 17921, 32248, 60489 OR 93804

FILE REFERENCE: MPI2001-286P1R (M) 
; CURRENT APPLICATION NUMBER: US/10/288,222A
; CURRENT FILING DATE: 2002-11-05 
; NUMBER OF SEQ ID NOS : 30 

; SOFTWARE: FastSEQ for Windows Version 4.0 
; SEQ ID NO 16 

LENGTH: 349 

TYPE: PRT 

ORGANISM: Homo Sapien 
US-10-288-222A-16 

Query Match 27.9%; Score 553; DB 14; Length 349; 

Best Local Similarity 36.0%; Pred. No. 2.1e-44; 

Matches 124; Conservative 77; Mismatches 109; Indels 34; Gaps 10; 

Qy 31 ELVFTWSTVMMGLLMFSLGCSVEIRKLWSHIRRPWGIAVGLLCQFGLMPFTAYLLAISF 90

: | : | : I : : I I I I I : : I I : : I : : I I : I : I : I : I : I I I I : : I I 
Db 24 DLALSVILVFMLFFIMLSLGCTMEFSKIKAHLWKPKGLAIALVAQYGIMPLTAFVLGKVF 83

Qy 91 SLKPVQAIAVLIMGCCPGGTISNIFTFWVDGDMDLSISMTTCSTVAALGMMPLCIYLYT- 149 

M : : I : I : I : I I I I I : I I : I : : I I I : I I I I I I I I I I I I I I I I : I : I : 
Db 84 RLKNIEALAILVCGCSPGGNLSNVFSLAMKGDMNLSIVMTTCSTFCALGMMPLLLYIYSR 143

Qy 150 --WSWSLQQNLTIPYQNIGITLVCLTIPVAFGVYVNYRWPKQSKIILKIGAWGGVLLLV 207

: | : : II : I I : I I : I I I : : : I : : :: I I :: : I : 
Db 144 GIYDGDLKDK--VPYKGIVISLVLVLIPCTIGIVLKSKRPQYMRYVIKGGMIIILL 197

Qy 208 VAVAGWLAKGSWNSDI TLLTISFIFPLIGHVTGFLL-ALFTHQSWQRC-RTIS 259 

: I I I I : : I I : I : I I I : I : : I I I I I I I I : I 

Db 198 CSVAVTVLSAINVGKSIMFAMTPLLIATSSLMPFIGFLLGYVLSALFCLNG--RCRRTVS 255

Qy 260 LETGAQNIQMCITMLQLSFTAEHLVQMLSFPLAYGLFQLIDGFLIVAAYQTYKRRLKNKH 319 

: I I I I I : I : I I : I : : I I : : I I I I : I I I : I I : : I : I : : I 

Db 256 METGCQNVQLCSTILNVAFPPEVIGPLFFFPLLYMIFQLGEGLLLIAIFWCYE-KFKTPK 314 

Qy 320 GKKNSGCTEVCHTRKSTSSRETNAFLEVNEEGAITPGPPGPMDC 363 

I | : : : | : | I I I : I II 

Db 315 DK TKMIYTAATT EETI PGALGNGTYKGEDC 344 



RESULT 10 
US-10-085-198-114 

; Sequence 114, Application US/10085198 

; Publication No. US20040009907A1 

; GENERAL INFORMATION: 

; APPLICANT: Alsobrook et al. 

; TITLE OF INVENTION: Proteins and Nucleic Acids Encoding Same 
; FILE REFERENCE: 21402-279 



CURRENT APPLICATION NUMBER: US/10/085,198
CURRENT FILING DATE: 2002-02-25 
PRIOR APPLICATION NUMBER: 60/271,646 
PRIOR FILING DATE: 2001-02-26 
PRIOR APPLICATION NUMBER: 60/276,401 
PRIOR FILING DATE: 2001-03-16 
PRIOR APPLICATION NUMBER: 60/311,981 
PRIOR FILING DATE: 2001-08-13 
PRIOR APPLICATION NUMBER: 60/312,858 
PRIOR FILING DATE: 2001-08-16 
PRIOR APPLICATION NUMBER: 60/271,840 
PRIOR FILING DATE: 2001-02-27 
PRIOR APPLICATION NUMBER: 60/277,324 
PRIOR FILING DATE: 2001-03-20 
PRIOR APPLICATION NUMBER: 60/286,096 
PRIOR FILING DATE: 2001-04-21 
PRIOR APPLICATION NUMBER: 60/299,695 
PRIOR FILING DATE: 2001-06-20 
PRIOR APPLICATION NUMBER: 60/315,614 
PRIOR FILING DATE: 2001-08-29 
PRIOR APPLICATION NUMBER: 60/272,405 
PRIOR FILING DATE: 2001-02-28 

Remaining Prior Application data removed - See File Wrapper or PALM. 
NUMBER OF SEQ ID NOS : 653 
SOFTWARE: PatentIn Ver. 2.1
SEQ ID NO 114 
LENGTH: 440 
TYPE: PRT 

ORGANISM: Homo sapiens 
US-10-085-198-114 

Query Match 27.0%; Score 534.5; DB 15; Length 440; 

Best Local Similarity 37.3%; Pred. No. 1.7e-42; 

Matches 112; Conservative 60; Mismatches 113; Indels 15; Gaps 5; 

Qy 27 WLELVFTWSTVMMGLLMFSLGCSVEIRKLWSHIRRPWGIAVGLLCQFGLMPFTAYLL 86

Db 103 [residues not recoverable from the source]

Qy 87 [residues not recoverable from the source]

Db 158 [residues not recoverable from the source]

Qy 147 [residues not recoverable from the source]

Db 218 [residues not recoverable from the source]

Qy 204 - — AGWLAKGSWNS-DITLLTISFIFPLIGHVTGFLLALFTHQSWQRCRTI 258

Db 276 [residues not recoverable from the source]

Qy 259 [residues not recoverable from the source]

Db 336 [residues not recoverable from the source]




RESULT 11 



US-10-093-463-22 

; Sequence 22, Application US/10093463 
; Publication No. US20030208039A1 



; GENERAL INFORMATION:
; APPLICANT: Padigaru, Muralidhara
; APPLICANT: Shenoy, Suresh
; APPLICANT: Kekuda, Ramesh
; APPLICANT: Gusev, Vladimir
; APPLICANT: Pochart, Pascal
; APPLICANT: Zhong, Mei
; APPLICANT: Rastelli, Luca
; APPLICANT: Mezes, Peter
; APPLICANT: Smithson, Glennda
; APPLICANT: Guo, Xiaojia
; APPLICANT: Gerlach, Valerie
; APPLICANT: Casman, Stacie
; APPLICANT: Boldog, Ferenc
; APPLICANT: Li, Li
; APPLICANT: Zerhusen, Bryan
; APPLICANT: Tchernev, Velizar
; APPLICANT: Gangolli, Esha
; APPLICANT: Vernet, Corine
; APPLICANT: Pena, Carol
; APPLICANT: Burgess, Catherine
; APPLICANT: Liu, Xiaohong
; APPLICANT: Spytek, Kimberly
; APPLICANT: Gorman, Linda
; APPLICANT: Spaderna, Steven
; APPLICANT: Voss, Edward
; APPLICANT: Malyankar, Uriel
; APPLICANT: Anderson, David
; APPLICANT: Patturajan, Meera
; APPLICANT: Miller, Charles
; APPLICANT: Taupier, Raymond J. Jr.



; TITLE OF INVENTION: Novel Antibodies that Bind to Antigenic Polypeptides, Nucleic Acids
; TITLE OF INVENTION: Encoding The Antigens, and Methods of Use.

FILE REFERENCE: 21402-290A (Cura 590AT) 
; CURRENT APPLICATION NUMBER: US/10/093,463
; CURRENT FILING DATE: 2002-06-24 
; PRIOR APPLICATION NUMBER: 60/283,675 

PRIOR FILING DATE: 2001-04-14 

PRIOR APPLICATION NUMBER: 60/338,092 
; PRIOR FILING DATE: 2001-12-03 
; PRIOR APPLICATION NUMBER: 60/274,281 
; PRIOR FILING DATE: 2001-03-08 
; PRIOR APPLICATION NUMBER: 60/274,101 
; PRIOR FILING DATE: 2001-03-08 
; PRIOR APPLICATION NUMBER: 60/325,681 
; PRIOR FILING DATE: 2001-09-27 

PRIOR APPLICATION NUMBER: 60/304,354 
; PRIOR FILING DATE: 2001-07-10 
; PRIOR APPLICATION NUMBER: 60/279,995 
; PRIOR FILING DATE: 2001-03-30 

PRIOR APPLICATION NUMBER: 60/294,899 
; PRIOR FILING DATE: 2001-05-31 
; PRIOR APPLICATION NUMBER: 60/287,424 



; PRIOR FILING DATE: 2001-04-30 

; PRIOR APPLICATION NUMBER: 60/299,027 

; PRIOR FILING DATE: 2001-06-18 

; PRIOR APPLICATION NUMBER: 60/309,198 

; PRIOR FILING DATE: 2001-07-31 

; PRIOR APPLICATION NUMBER: 60/281,194 

; PRIOR FILING DATE: 2001-04-04 

; PRIOR APPLICATION NUMBER: 60/274,194 

PRIOR FILING DATE: 2001-03-08 
; PRIOR APPLICATION NUMBER: 60/274,849 
; PRIOR FILING DATE: 2001-03-09 
; PRIOR APPLICATION NUMBER: 60/330,380 
; PRIOR FILING DATE: 2001-10-18 
; PRIOR APPLICATION NUMBER: 60/275,235 
; PRIOR FILING DATE: 2001-03-12 
; PRIOR APPLICATION NUMBER: 60/288,342 
; PRIOR FILING DATE: 2001-05-03 
; PRIOR APPLICATION NUMBER: 60/275,578 
; PRIOR FILING DATE: 2001-03-13 
; NUMBER OF SEQ ID NOS: 370
; SOFTWARE: PatentIn Ver. 2.1
; SEQ ID NO 22 
LENGTH: 367 
TYPE: PRT 
; ORGANISM: Homo sapiens 
US-10-093-463-22 

Query Match 16.3%; Score 322.5; DB 15; Length 367; 

Best Local Similarity 26.9%; Pred. No. 2.8e-22; 



Matches 79; Conservative 71; Mismatches 131; Indels 13; Gaps 5; 

Qy 26 VHGNLELVFTWSTVMMGLLMFSLGCSVEIRKLWSHIRRPWGIAVGLLCQFGLMPFTAYL 85

: | : : : : : : : : I I I : I : : : : I I : : I : I I I I I I : I 

Db 66 MHIDRNILMLILPLILLNKCAF — GCKIELQLFQTVWKRPLPVILGAVTQFFLMPFCGFL 123 

Qy 86 LAISFSLKPVQAIAVLIMGCCPGGTISNIFT FWVDGDMDLSISMTTCSTVAALGMMPLCI 145 

I : : | II I : : I I II : I : I I I I : I I I I I : I I I I I :. 

Db 124 LSQIVALPEAQAFGVWTCTCPGGGGGYLFALLLDGDFTLAILMTCTSTLLALIMMPVNS 183

Qy 146 YLYTWSWSLQQNLTIPYQNIGITLVCLTIPVAFGVYVNYRWPKQSKIILKIGAWGGVLL 205 

I : I : | II I | | : : : | | : | : : : | | : : : : : I : : I : 

Db 184 YIYSRILGLSGTFHIPVSKIVSTLLFILVPVSIGIVIKHRIPEKASFLERIIRPLSFILM 243 

Qy 206 LV----VAVAGWLAKGSWNSDITLLTISFIFPLIGHVTGFLLALFTHQSWQRCRTISLE 261 

| | : | | :::::: I : I : I : I I : I : : : I 

Db 244 FVGIYLTFTVGLVFLK TDNLEVILLGLLVPALGLLFGYSFAKVCTLPLPVCKTVAIE 300

Qy 262 TGAQNIQMCITMLQLSF— TAEHLVQMLSFPLAYGLFQLIDGFLIVAAYQTYKR 313 

: | | : : : : | | | | : : I : I : I : : I I : I : II 

Db 301 SGMLNSFLALAVIQLSFPQSKANLASVAP FTVA — MCSGCEMLLIILVYKAKKR 352 



RESULT 12 
US-10-093-463-26 

; Sequence 26, Application US/10093463 
; Publication No. US20030208039A1 
; GENERAL INFORMATION: 



APPLICANT: Padigaru, Muralidhara 
APPLICANT: Shenoy, Suresh 
APPLICANT: Kekuda, Ramesh 
APPLICANT: Gusev, Vladimir 
APPLICANT: Pochart, Pascal 
APPLICANT: Zhong, Mei 
APPLICANT: Rastelli, Luca 
APPLICANT: Mezes, Peter 
APPLICANT: Smithson, Glennda 
APPLICANT: Guo, Xiaojia 
APPLICANT: Gerlach, Valerie 
APPLICANT: Casman, Stacie 
APPLICANT: Boldog, Ferenc 
APPLICANT: Li, Li 
APPLICANT: Zerhusen, Bryan 
APPLICANT: Tchernev, Velizar 
APPLICANT: Gangolli, Esha 
APPLICANT: Vernet, Corine 
APPLICANT: Pena, Carol 
APPLICANT: Burgess , Catherine 
APPLICANT: Liu, Xiaohong 
APPLICANT: Spytek, Kimberly 
APPLICANT: Gorman, Linda 
APPLICANT: Spaderna, Steven 
APPLICANT: Voss, Edward 
APPLICANT: Malyankar, Uriel 
APPLICANT: Anderson, David 
APPLICANT: Patturajan, Meera 
APPLICANT: Miller, Charles 
APPLICANT: Taupier, Raymond J. Jr. 

TITLE OF INVENTION: Novel Antibodies that Bind to Antigenic Polypeptides, Nucleic Acids
TITLE OF INVENTION: Encoding The Antigens, and Methods of Use.
FILE REFERENCE: 21402-290A (Cura 590AT) 
CURRENT APPLICATION NUMBER: US/10/093,463 
CURRENT FILING DATE: 2002-06-24 
PRIOR APPLICATION NUMBER: 60/283,675 
PRIOR FILING DATE: 2001-04-14 
PRIOR APPLICATION NUMBER: 60/338,092 
PRIOR FILING DATE: 2001-12-03 
PRIOR APPLICATION NUMBER: 60/274,281 
PRIOR FILING DATE: 2001-03-08 
PRIOR APPLICATION NUMBER: 60/274,101 
PRIOR FILING DATE: 2001-03-08 
PRIOR APPLICATION NUMBER: 60/325,681 
PRIOR FILING DATE: 2001-09-27 
PRIOR APPLICATION NUMBER: 60/304,354 
PRIOR FILING DATE: 2001-07-10 
PRIOR APPLICATION NUMBER: 60/279,995 
PRIOR FILING DATE: 2001-03-30 
PRIOR APPLICATION NUMBER: 60/294,899 
PRIOR FILING DATE: 2001-05-31 
PRIOR APPLICATION NUMBER: 60/287,424 
PRIOR FILING DATE: 2001-04-30 
PRIOR APPLICATION NUMBER: 60/299,027 
PRIOR FILING DATE: 2001-06-18 
PRIOR APPLICATION NUMBER: 60/309,198 



; PRIOR FILING DATE: 2001-07-31 

; PRIOR APPLICATION NUMBER: 60/281,194 

; PRIOR FILING DATE: 2001-04-04 

; PRIOR APPLICATION NUMBER: 60/274,194 

; PRIOR FILING DATE: 2001-03-08 

; PRIOR APPLICATION NUMBER: 60/274,849 

; PRIOR FILING DATE: 2001-03-09 

; PRIOR APPLICATION NUMBER: 60/330,380 

; PRIOR FILING DATE: 2001-10-18 

; PRIOR APPLICATION NUMBER: 60/275,235 

; PRIOR FILING DATE: 2001-03-12 

; PRIOR APPLICATION NUMBER: 60/288,342 

; PRIOR FILING DATE: 2001-05-03 

; PRIOR APPLICATION NUMBER: 60/275,578 

; PRIOR FILING DATE: 2001-03-13 

; NUMBER OF SEQ ID NOS : 370 

; SOFTWARE: PatentIn Ver. 2.1

; SEQ ID NO 26 

LENGTH: 367 
; TYPE: PRT 

ORGANISM: Homo sapiens 
US-10-093-463-26 

Query Match 16.3%; Score 322.5; DB 15; Length 367; 

Best Local Similarity 26.9%; Pred. No. 2.8e-22;

Matches 79; Conservative 71; Mismatches 131; Indels 13; Gaps 5; 

Qy 26 VHGNLELVFTWSTVMMGLLMFSLGCSVEIRKLWSHIRRPWGIAVGLLCQFGLMPFTAYL 85 

: I : : : : : : : : I • I I : I : : : : I I : : I : I I I I I I : I 
Db 66 MHIDRNILMLILPLILLNKCAF--GCKIELQLFQTVWKRPLPVILGAVTQFFLMPFCGFL 123

Qy 86 LAISFSLKPVQAIAVLIMGCCPGGTISNIFTFWVDGDMDLSISMTTCSTVAALGMMPLCI 145

I : : I II I : : I I II : I : I I I hill I I : I I I I I : 
Db 124 LSQIVALPEAQAFGVWTCTCPGGGGGYLFALLLDGDFTLAILMTCTSTLLALIMMPVNS 183

Qy 146 YLYTWSWSLQQNLTIPYQNIGITLVCLTIPVAFGVYVNYRWPKQSKIILKIGAWGGVLL 205 

I : I : I II I | | : : : | | : I : : : I I : : : : : I : : I : 

Db 184 YIYSRILGLSGTFHIPVSKIVSTLLFILVPVSIGIVIKHRIPEKASFLERIIRPLSFILM 243

Qy 206 LV VAVAGWLAKGSWNSDITLLTISFIFPLIGHWGFLIALFTHQSWQRCRTISLE 261 

I I : I I :::::: | : | : | : | | : | :: : | 

Db 244 FVGIYLTFTVGLVFLK TDNLEVILLGLLVPALGLLFGYSFAKVCTLPLPVCKTVAIE 300

Qy 262 TGAQNIQMCITMLQLSF--TAEHLVQMLSFPLAYGLFQLIDGFLIVAAYQTYKR 313

: I I : : : : I I I I : : I : I : I : : I I : I : II 

Db 301 SGMLNSFLALAVIQLSFPQSKANLASVAPFTVA--MCSGCEMLLIILVYKAKKR 352



RESULT 13 
US-10-091-628-5 

; Sequence 5, Application US/10091628 
; Publication No. US20020164627A1
; GENERAL INFORMATION: 

; APPLICANT: Wilganowski, Nathaniel L. 
; APPLICANT: Nepomnichy, Boris 
; APPLICANT: Burnett, Michael B. 
; APPLICANT: Hu, Yi 



; TITLE OF INVENTION: Novel Human Transporter Proteins and Polynucleotides Encoding the
; TITLE OF INVENTION: Same
; FILE REFERENCE: LEX-0314-USA 
; CURRENT APPLICATION NUMBER: US/10/091,628 
; CURRENT FILING DATE: 2002-03-06 
; PRIOR APPLICATION NUMBER: US 60/275,009 
; PRIOR FILING DATE: 2001-03-12 
; PRIOR APPLICATION NUMBER: US 60/284,152 
; PRIOR FILING DATE: 2001-04-17 
; NUMBER OF SEQ ID NOS : 6 

; SOFTWARE : FastSEQ for Windows Version 4.0 
; SEQ ID NO 5 
; LENGTH: 438 

TYPE: PRT 
; ORGANISM: Homo sapiens 
US-10-091-628-5 

Query Match 16.3%; Score 322.5; DB 13; Length 438; 

Best Local Similarity 26.9%; Pred. No. 3.5e-22; 

Matches 79; Conservative 71; Mismatches 131; Indels 13; Gaps 5; 

Qy 26 VHGNLELVFTWSTVMMGLLMFSLGCSVEIRKLWSHIRRPWGIAVGLLCQFGLMPFTAYL 85

: | : : : : : : : : I I I : I : : : : I I : : I : I I I I I I : I 

Db 137 MHIDRNILMLILPLILLNKCAF — GCKIELQLFQTVWKRPLPVILGAVTQFFLMPFCGFL 194 

Qy 86 LAISFSLKPVQAIAVLIMGCCPGGTISNIFTFWVDGDMDLSISMTTCSTVAALGMMPLCI 145 

I : : | || I : : I I I I : I : I I I hill I I : I I I I I : 

Db 195 LSQIVALPEAQAFGWMTCTCPGGGGGYLFALLLDGDFTLAILMTCTSTLLALIMMPVNS 254

Qy 146 YLYTWSWSLQQNLTIPYQNIGITLVCLTIPVAFGVYVNYRWPKQSKIILKIGAWGGVLL 205

I : I : I II I I I : : : I I : I : : : I I : : : : : I : ' I ' 

Db 255 YIYSRILGLSGTFHIPVSKIVSTLLFILVPVSIGIVIKHRIPEKASFLERIIRPLSFILM 314 

Qy 206 LV VAVAGWLAKGSWNSDITLLTISFIFPLIGHVTGFLLALFTHQSWQRCRTISLE 261 

| I : I I :::::: I : I : I : I I : I : : : I 

Db 315 FVGIYLTFTVGLVFLK TDNLEVILLGLLVPALGLLFGYSFAKVCTLPLPVCKTVAIE 371

Qy 262 TGAQNIQMCITMLQLSF— TAEHLVQMLSFPLAYGLFQLIDGFLIVAAYQTYKR 313 

: I | : : : : | | | | : : I : I : I : : I I : I : II 

Db 372 SGMLNSFLALAVIQLSFPQSKANLASVAPFTVA--MCSGCEMLLIILVYKAKKR 423



RESULT 14 
US-10-093-463-24 

Sequence 24, Application US/10093463 
Publication No. US20030208039A1 
GENERAL INFORMATION: 
APPLICANT: Padigaru, Muralidhara 
APPLICANT: Shenoy, Suresh 
APPLICANT: Kekuda, Ramesh 
APPLICANT: Gusev, Vladimir 
APPLICANT: Pochart, Pascal 
APPLICANT: Zhong, Mei 
APPLICANT: Rastelli, Luca 
APPLICANT: Mezes, Peter 
APPLICANT: Smithson, Glennda 



APPLICANT: Guo, Xiaojia
APPLICANT: Gerlach, Valerie
APPLICANT: Casman, Stacie
APPLICANT: Boldog, Ferenc
APPLICANT: Li, Li
APPLICANT: Zerhusen, Bryan
APPLICANT: Tchernev, Velizar
APPLICANT: Gangolli, Esha
APPLICANT: Vernet, Corine
APPLICANT: Pena, Carol
APPLICANT: Burgess, Catherine
APPLICANT: Liu, Xiaohong
APPLICANT: Spytek, Kimberly
APPLICANT: Gorman, Linda
APPLICANT: Spaderna, Steven
APPLICANT: Voss, Edward
APPLICANT: Malyankar, Uriel
APPLICANT: Anderson, David
APPLICANT: Patturajan, Meera
APPLICANT: Miller, Charles
APPLICANT: Taupier, Raymond J. Jr.



; TITLE OF INVENTION: Novel Antibodies that Bind to Antigenic Polypeptides, Nucleic Acids
; TITLE OF INVENTION: Encoding The Antigens, and Methods of Use.

; FILE REFERENCE: 21402-290A (Cura 590AT) 

; CURRENT APPLICATION NUMBER: US/10/093,463

; CURRENT FILING DATE: 2002-06-24 

; PRIOR APPLICATION NUMBER: 60/283,675 

; PRIOR FILING DATE: 2001-04-14 

; PRIOR APPLICATION NUMBER: 60/338,092 

; PRIOR FILING DATE: 2001-12-03 

; PRIOR APPLICATION NUMBER: 60/274,281 

; PRIOR FILING DATE: 2001-03-08 

; PRIOR APPLICATION NUMBER: 60/274,101 

; PRIOR FILING DATE: 2001-03-08 

; PRIOR APPLICATION NUMBER: 60/325,681 

; PRIOR FILING DATE: 2001-09-27 

PRIOR APPLICATION NUMBER: 60/304,354 

; PRIOR FILING DATE: 2001-07-10 

; PRIOR APPLICATION NUMBER: 60/279,995 

; PRIOR FILING DATE: 2001-03-30 

; PRIOR APPLICATION NUMBER: 60/294,899 

; PRIOR FILING DATE: 2001-05-31 

; PRIOR APPLICATION NUMBER: 60/287,424 

PRIOR FILING DATE: 2001-04-30 

; PRIOR APPLICATION NUMBER: 60/299,027 

; PRIOR FILING DATE: 2001-06-18 

; PRIOR APPLICATION NUMBER: 60/309,198 

; PRIOR FILING DATE: 2001-07-31 

; PRIOR APPLICATION NUMBER: 60/281,194 

; PRIOR FILING DATE: 2001-04-04 

; PRIOR APPLICATION NUMBER: 60/274,194 

; PRIOR FILING DATE: 2001-03-08 

; PRIOR APPLICATION NUMBER: 60/274,849 

; PRIOR FILING DATE: 2001-03-09 

; PRIOR APPLICATION NUMBER: 60/330,380 

; PRIOR FILING DATE: 2001-10-18 



; PRIOR APPLICATION NUMBER: 60/275,235 

; PRIOR FILING DATE: 2001-03-12 

; PRIOR APPLICATION NUMBER: 60/288,342 

; PRIOR FILING DATE: 2001-05-03 

; PRIOR APPLICATION NUMBER: 60/275,578 

; PRIOR FILING DATE: 2001-03-13 

; NUMBER OF SEQ ID NOS: 370

; SOFTWARE: PatentIn Ver. 2.1

; SEQ ID NO 24 

LENGTH: 438 

TYPE: PRT 
; ORGANISM: Homo sapiens 
US-10-093-463-24 

Query Match 16.3%; Score 322.5; DB 15; Length 438; 

Best Local Similarity 26.9%; Pred. No. 3.5e-22; 

Matches 79; Conservative 71; Mismatches 131; Indels 13; Gaps 5; 

Qy 26 VHGNLELVFTWSTVMMGLLMFSLGCSVEIRKLWSHIRRPWGIAVGLLCQFGLMPFTAYL 85

: | : : : : : : : : I I I : I : : : : I I : : I : I I I I I I : I 
Db 137 MHIDRNILMLILPLILLNKCAF— GCKIELQLFQTVWKRPLPVILGAVTQFFLMPFCGFL 194 

Qy 86 LAISFSLKPVQAIAVLIMGCCPGGTISNIFTFWVDGDMDLSISMTTCSTVAALGMMPLCI 145 

I : : I II I : : I II I : I : I I I hill 11= II 111 = 

Db 195 LSQIVALPEAQAFGWMTCTCPGGGGGYLFALLLDGDFTLAILMTCTSTLLALIMMPVNS 254

Qy 146 YLYTWSWSLQQNLTIPYQNIGITLVCLTIPVAFGVYVNYRWPKQSKIILKIGAWGGVLL 205 

|:|: I II I | | : : : I |: I : : : I h : : : : I : : h 

Db 255 YIYSRILGLSGTFHIPVSKIVSTLLFILVPVSIGIVIKHRIPEKASFLERIIRPLSFILM 314 

Qy 206 LV VAVAGWLAKGSWNSDITLLTISFIFPLIGHVTGFLLALFTHQSWQRCRTISLE 261 

| I : I I :::::: | : | : | : I I : I : : : I 

Db 315 FVGIYLTFTVGLVFLK TDNLEVILLGLLVPALGLLFGYSFAKVCTLPLPVCKTVAIE 371 

Qy 262 TGAQNIQMCITMLQLSF— TAEHLVQMLSFPLAYGLFQLIDGFLIVAAYQTYKR 313 

: I | : : : : | | | | : : | : I : I : : I I : I : II 

Db 372 SGMLNSFLALAVIQLSFPQSKANLASVAPFTVA--MCSGCEMLLIILVYKAKKR 423



RESULT 15 

US-10-108-260A-3362 

; Sequence 3362, Application US/10108260A 
; Publication No. US20040005560A1 
; GENERAL INFORMATION: 

; APPLICANT: HELIX RESEARCH INSTITUTE 

; TITLE OF INVENTION: Novel full length cDNA
; FILE REFERENCE: H1-A0106 

; CURRENT APPLICATION NUMBER: US/10/108,260A
; CURRENT FILING DATE: 2002-03-27 
; NUMBER OF SEQ ID NOS: 5458 
; SOFTWARE: Patentln Ver. 2.1 
; SEQ ID NO 3362 

LENGTH: 438 

TYPE: PRT 

ORGANISM: Homo sapiens 
US-10-108-260A-3362 



Query Match 16.3%; Score 322.5; DB 15; Length 438; 

Best Local Similarity 26.9%; Pred. No. 3.5e-22; 

Matches 79; Conservative 71; Mismatches 131; Indels 13; Gaps 5; 



Qy 26 VHGNLELVFTWSTVMMGLLMFSLGCSVEIRKLWSHIRRPWGIAVGLLCQFGLMPFTAYL 85

: I : : : : : : : : I I I : I : : : : I I : : I : I I I I I I : I 

Db 137 MHIDRNILMLILPLILLNKCAF--GCKIELQLFQTVWKRPLPVILGAVTQFFLMPFCGFL 194

Qy 86 LAISFSLKPVQAIAVLIMGCCPGGTISNIFTFWVDGDMDLSISMTTCSTVAALGMMPLCI 145

I : : I II I : : I I I I : I : I I I hill I I : I I I I I : 

Db 195 LSQIVALPEAQAFGVVMTCTCPGGGGGYLFALLLDGDFTLAILMTCTSTLLALIMMPVNS 254

Qy 146 YLYTWSWSLQQNLTIPYQNIGITLVCLTIPVAFGVYVNYRWPKQSKIILKIGAWGGVLL 205 

I : I : I II I I I : : : I I : I : : : I I : : : : : I : : I : 

Db 255 YIYSRILGLSGTFHIPVSKIVSTLLFILVPVSIGIVIKHRIPEKASFLERIIRPLSFILM 314

Qy 206 LV VAVAGVVLAKGSWNSDITLLTISFIFPLIGHVTGFLLALFTHQSWQRCRTISLE 261

| | : I | :::::: | : | : | : | | : I : : : I 

Db 315 FVGIYLTFTVGLVFLK TDNLEVILLGLLVPALGLLFGYSFAKVCTLPLPVCKTVAIE 371

Qy 262 TGAQNIQMCITMLQLSF— TAEHLVQMLSFPLAYGLFQLIDGFLIVAAYQTYKR 313 

: | I : : : : I I I I : : I : I : I : : I I : I : I I 
Db 372 SGMLNSFLALAVIQLSFPQSKANLASVAPFTVA--MCSGCEMLLIILVYKAKKR 423



Search completed: March 23, 2004, 14:39:23 
Job time : 47 secs



GenCore version 5.1.6 
Copyright (c) 1993 - 2004 Compugen Ltd. 



OM protein - protein search, using sw model 
Run on:         March 23, 2004, 14:32:47 ; Search time 45 Seconds
                (without alignments)
                2643.341 Million cell updates/sec

Title:          US-10-091-628-2
Perfect score:  1979
Sequence:       1 MRANCSSSSACPANSSEEEL PGPMDCHRALEPVGHITSCE 377

Scoring table:  BLOSUM62
                Gapop 10.0 , Gapext 0.5

Searched:       1017041 seqs, 315518202 residues

Total number of hits satisfying chosen parameters:  1017041

Minimum DB seq length: 0 

Maximum DB seq length: 2000000000 

Post-processing: Minimum Match 0% 

Maximum Match 100% 
Listing first 45 summaries 
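
The parameters above (the "sw" model, a scoring table, Gapop 10.0 and Gapext 0.5) describe a Smith-Waterman local alignment with affine gap penalties. The Python sketch below illustrates that scoring recurrence only; it is not GenCore's implementation. The names `sw_affine_score` and `toy_score` are hypothetical, the toy match/mismatch scorer stands in for BLOSUM62, and the convention that a gap of length k costs gap_open + (k - 1) * gap_ext is an assumption, since this report does not state how the two gap values are combined.

```python
def sw_affine_score(a, b, score, gap_open=10.0, gap_ext=0.5):
    """Best local alignment score of a and b with affine gaps (Gotoh recurrence).

    Assumption: a gap of length k costs gap_open + (k - 1) * gap_ext.
    """
    neg_inf = float("-inf")
    cols = len(b) + 1
    h_prev = [0.0] * cols       # best score ending at each cell of the previous row
    f_prev = [neg_inf] * cols   # previous row: best score ending in a gap in b
    best = 0.0
    for i in range(1, len(a) + 1):
        h_cur = [0.0] * cols
        f_cur = [neg_inf] * cols
        e = neg_inf             # current row: best score ending in a gap in a
        for j in range(1, cols):
            # Either extend an existing gap or open a new one.
            e = max(e - gap_ext, h_cur[j - 1] - gap_open)
            f_cur[j] = max(f_prev[j] - gap_ext, h_prev[j] - gap_open)
            diag = h_prev[j - 1] + score(a[i - 1], b[j - 1])
            # Local alignment: a cell never scores below zero.
            h_cur[j] = max(0.0, diag, e, f_cur[j])
            best = max(best, h_cur[j])
        h_prev, f_prev = h_cur, f_cur
    return best


def toy_score(x, y):
    """Hypothetical stand-in for a substitution matrix such as BLOSUM62."""
    return 5.0 if x == y else -4.0
```

To approximate the search settings in this report, `toy_score` would be replaced by a lookup into a real BLOSUM62 table; the defaults for `gap_open` and `gap_ext` already mirror the Gapop and Gapext values listed above.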

Database : SPTREMBL_25:*

1: sp_archea:*
2: sp_bacteria:*
3: sp_fungi:*
4: sp_human:*
5: sp_invertebrate:*
6: sp_mammal:*
7: sp_mhc:*
8: sp_organelle:*
9: sp_phage:*
10: sp_plant:*
11: sp_rodent:*
12: sp_virus:*
13: sp_vertebrate:*
14: sp_unclassified:*
15: sp_rvirus:*
16: sp_bacteriap:*
17: sp_archeap:*

Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 
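
Besides Pred. No., each result block reports a Query Match and a Best Local Similarity percentage. The figures printed in this report are consistent with two simple ratios: Query Match as the hit score over the perfect score (1979 for this query), and Best Local Similarity as matches over all aligned columns. The Python sketch below only checks that arithmetic against values printed here. The formulas are inferred from the numbers in this report, not taken from GenCore documentation, and the function names are hypothetical.

```python
def query_match(score, perfect_score):
    """Percentage of the perfect (self-alignment) score reached by a hit."""
    return 100.0 * score / perfect_score


def best_local_similarity(matches, conservative, mismatches, indels):
    """Percentage of aligned columns that are exact matches.

    Assumption: the denominator is the sum of all four column counts.
    """
    aligned_columns = matches + conservative + mismatches + indels
    return 100.0 * matches / aligned_columns


# Values copied from the P70172 result block in this report.
print(round(query_match(871, 1979), 1))
print(round(best_local_similarity(167, 74, 97, 14), 1))
```

The same ratios reproduce the other blocks in this section, for example Score 1415 with Matches 265, Conservative 50, Mismatches 58 and Indels 4 for Q9CXB2.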



SUMMARIES 
ALIGNMENTS



RESULT 1 
Q9CXB2 

ID Q9CXB2 PRELIMINARY; PRT; 373 AA.

AC Q9CXB2;

DT 01-JUN-2001 (TrEMBLrel. 17, Created)

DT 01-JUN-2001 (TrEMBLrel. 17, Last sequence update)

DT 01-JUN-2003 (TrEMBLrel. 24, Last annotation update)

DE 8430417G17Rik protein.

GN 8430417G17RIK. 

OS Mus musculus (Mouse).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.

OX NCBI_TaxID=10090; 

RN [1] 

RP SEQUENCE FROM N.A. 

RC STRAIN=C57BL/6J; TISSUE=Embryonic lung; 

RX MEDLINE=21085660; PubMed=11217851;

RA Kawai J., Shinagawa A., Shibata K., Yoshino M., Itoh M., Ishii Y.,

RA Arakawa T., Hara A., Fukunishi Y., Konno H., Adachi J., Fukuda S.,

RA Aizawa K., Izawa M., Nishi K., Kiyosawa H., Kondo S., Yamanaka I.,

RA Saito T., Okazaki Y., Gojobori T., Bono H., Kasukawa T., Saito R.,

RA Kadota K., Matsuda H.A., Ashburner M., Batalov S., Casavant T.,

RA Fleischmann W., Gaasterland T., Gissi C., King B., Kochiwa H.,

RA Kuehl P., Lewis S., Matsuo Y., Nikaido I., Pesole G., Quackenbush J.,

RA Schriml L.M., Staubli F., Suzuki R., Tomita M., Wagner L., Washio T.,

RA Sakai K., Okido T., Furuno M., Aono H., Baldarelli R., Barsh G.,

RA Blake J., Boffelli D., Bojunga N., Carninci P., de Bonaldo M.F.,

RA Brownstein M.J., Bult C., Fletcher C., Fujita M., Gariboldi M.,

RA Gustincich S., Hill D., Hofmann M., Hume D.A., Kamiya M., Lee N.H.,

RA Lyons P., Marchionni L., Mashima J., Mazzarelli J., Mombaerts P.,

RA Nordone P., Ring B., Ringwald M., Rodriguez I., Sakamoto N.,

RA Sasaki H., Sato K., Schoenbach C., Seya T., Shibata Y., Storch K.-F.,

RA Suzuki H., Toyo-oka K., Wang K.H., Weitz C., Whittaker C., Wilming L.,

RA Wynshaw-Boris A., Yoshida K., Hasegawa Y., Kawaji H., Kohtsuki S.,

RA Hayashizaki Y.;

RT "Functional annotation of a full-length mouse cDNA collection."; 

RL Nature 409:685-690(2001). 

DR EMBL; AK018423; BAB31203.1; -. 

DR MGD; MGI:1923000; 8430417G17Rik.

DR GO; GO:0016020; C:membrane; IEA.

DR GO; GO:0008508; F:bile acid:sodium symporter activity; IEA.

DR GO; GO:0006814; P:sodium ion transport; IEA.

DR InterPro; IPR002657; BilAc/Na_symport.

DR Pfam; PF01758; SBF; 1.

SQ SEQUENCE 373 AA; 40681 MW; 0902D18506A8AC55 CRC64;



Query Match 71.5%; Score 1415; DB 11; Length 373; 

Best Local Similarity 70.3%; Pred. No. 2.9e-108; 

Matches 265; Conservative 50; Mismatches 58; Indels 4; Gaps 2; 



Qy 1 MRANCSSSSACPANSSEEELPVGLEVHGNLELVFTWSTVMMGLLMFSLGCSVEIRKLWS 60 

I : i : : 1 I I I I : I I : I I I : I I I I : I : I I I : I I I : I I : I I I I I I I I : I I I 

Db 1 MSTDCAGNSTCPWSTEEDPPVGMEGHANLKLLFTVLSAVMVGLVMFSFGCSVESQKLWL 60 

Qy 61 HIRRPWGIAVGLLCQFGLMPFTAYLLAISFSLKPVQAIAVLIMGCCPGGTISNIFTFWVD 120 

I : I I I I I I I I I I I I I I I I I I I I I II I I III lllllhll MINIM: I I f I I 

Db 61 HLRRPWGIAVGLLSQFGLMPLTAYLLAIGFGLKPFQAIAVLMMGSCPGGTISNVLTFWVD 120 

Qy 121 GDMDLSISMTTCSTVAALGMMPLCIYLYTWSWSLQQNLTIPYQNIGITLVCLTIPVAFGV 180

I I I I I I I II I II I I I I I II M I I I : I : I I II : I III I I I I : II M M I MM II 

Db 121 GDMDLSISMTTCSTVAALGMMPLCLYIYTRSWTLTQNLVIPYQSIGITLVSLWPVASGV 180 

Qy 181 YVNYRWPKQSKIILKIGAVVGGVLLLVVAVAGVVLAKGSWNSDITLLTISFIFPLIGHVT 240

I M I I I II I : M I I M I : : I I : I II I I II I M I I I I I I M M M II I I M : I I M 



Db 181 YWYRWPKQATVILKVGAILGGMLLLWAVTG1WLAKG-WNTDVTLLVISCIFPLVGHVT 239



Qy 241 GFLLALFTHQSWQRCRTISLETGAQNIQMCITMLQLSFTAEHLVQMLSFPLAYGLFQLID 300 

I I I II I I I I I I I I I I I I : I I I I I I I I : I I I I I i I I : I I : I I I : I : I I I i I I I I : = 

Db 240 GFLLAFLTHQSWQRCRTISIETGAQNIQLCIAMLQLSFSAEYLVQLLNFALAYGLFQVLH 299

Qy 301 GFLIVAAYQTYKRRLKNKHGKKNSGCTEVCHTRKSTSSRETNAFLEVNEEGAITPGPPGP 360 

I I I I I I I I I I I I I : I : : : I : I I : : : I I I : I I I : : I I : I I I I 
Db 300 GLLIVAAYQAYKRRQKSKCRRQHPDCPDVCYEKQ PRETSAFLDKGDEAAVTLGPVQP 356 

Qy 361 MDCHRALEPVGHITSCE 377 



Db 357 EQHHRAAELTSHIPSCE 373



RESULT 2 
Q7T3A9 

ID Q7T3A9 PRELIMINARY; PRT; 361 AA. 

AC Q7T3A9; 

DT 01-OCT-2003 (TrEMBLrel. 25, Created) 

DT 01-OCT-2003 (TrEMBLrel. 25, Last sequence update) 

DT 01-OCT-2003 (TrEMBLrel. 25, Last annotation update) 

DE Hypothetical protein. 

OS Brachydanio rerio (Zebrafish) (Danio rerio).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Actinopterygii; Neopterygii; Teleostei; Ostariophysi; Cypriniformes;

OC Cyprinidae; Danio. 

OX NCBI_TaxID=7955; 

RN [1] 

RP SEQUENCE FROM N.A. 

RC TISSUE=Kidney; 

RX MEDLINE=22388257; PubMed=12477932;

RA Strausberg R.L., Feingold E.A., Grouse L.H., Derge J.G.,

RA Klausner R.D., Collins F.S., Wagner L., Shenmen C.M., Schuler G.D.,

RA Altschul S.F., Zeeberg B., Buetow K.H., Schaefer C.F., Bhat N.K.,

RA Hopkins R.F., Jordan H., Moore T., Max S.I., Wang J., Hsieh F.,

RA Diatchenko L., Marusina K., Farmer A.A., Rubin G.M., Hong L.,

RA Stapleton M., Soares M.B., Bonaldo M.F., Casavant T.L., Scheetz T.E.,

RA Brownstein M.J., Usdin T.B., Toshiyuki S., Carninci P., Prange C.,

RA Raha S.S., Loquellano N.A., Peters G.J., Abramson R.D., Mullahy S.J.,

RA Bosak S.A., McEwan P.J., McKernan K.J., Malek J.A., Gunaratne P.H.,

RA Richards S., Worley K.C., Hale S., Garcia A.M., Gay L.J., Hulyk S.W.,

RA Villalon D.K., Muzny D.M., Sodergren E.J., Lu X., Gibbs R.A.,

RA Fahey J., Helton E., Ketteman M., Madan A., Rodrigues S., Sanchez A.,

RA Whiting M., Madan A., Young A.C., Shevchenko Y., Bouffard G.G.,

RA Blakesley R.W., Touchman J.W., Green E.D., Dickson M.C.,

RA Rodriguez A.C., Grimwood J., Schmutz J., Myers R.M., Butterfield Y.S.,

RA Krzywinski M.I., Skalska U., Smailus D.E., Schnerch A., Schein J.E.,

RA Jones S.J., Marra M.A.;

RT "Generation and initial analysis of more than 15,000 full-length human 

RT and mouse cDNA sequences."; 

RL Proc. Natl. Acad. Sci. U.S.A. 99:16899-16903(2002). 

RN [2] 

RP SEQUENCE FROM N.A. 

RC TISSUE=Kidney; 

RA Strausberg R. ; 

RL Submitted (JUN-2003) to the EMBL/GenBank/DDBJ databases. 



DR EMBL; BC053189; AAH53189.1; 
KW Hypothetical protein. 

SQ SEQUENCE 361 AA; 39284 MW; 5729C89AEBABC323 CRC64; 

Query Match 44.6%; Score 883; DB 13; Length 361; 

Best Local Similarity 48.5%; Pred. No. 1.5e-64; 

Matches 172; Conservative 62; Mismatches 107; Indels 14; Gaps 5; 

Qy 5 CSSSSACPANSSEEELPVGLE VHGNLELVFTWSTVMMGLLMFSLGCSVEIRK 57 

I : III:: : I : I : I : I I I I : : : I I I : I I : I I I I 

Db 2 CTLEPVCPVNAT 1 CTGT S CLVPRDP FNDI LS WMSVAI TVMLAMVMFSMGCTVEARK 58 

Qy 58 LWSHIRRPWGIAVGLLCQFGLMPFTAYLLAISFSLKPVQAIAVLIMGCCPGGTISNIFTF 117 

II hllll II :| llllhll I I I:: I:: |:: MM: ::|lllllll: I hi : 

Db 59 LWGHVRRPWGIFIGFLCQFGIMPFTAFILSLLFNVLPVQAWIIIMGCCPGGSSSNVFCY 118 

Qy 118 WVDGDMDLSISMTTCSTVAALGMMPLCIYLYTWSWSLQQNLTIPYQNIGITLVCLTIPVA 177

I : I I I I I I I I I I I I I :: I I I I I I I I :: I I I : : I I I I I I I I I I I : I I 
Db 119 WLDGDMDLSISMTACSSILALGMMPLCLLIYTTIWTAGDAIQIPYDNIGITLVSLLVPVG 178 

Qy 178 FGVYVNYRWPKQSKIILKIGAVVGGVLLLVVAVAGVVLAKGSWNSDITLLTISFIFPLIG 237

I : I : : I I I : I I I I : I : I I I I I : : I : II I I I : II : I I hill 
Db 179 LGMLVKHKWPKAAKKILKVGSWGIVLIIVIAVIGGVLYQSSWTIAPSLWIIGTIYPFIG 238

Qy 238 HVTGFLLALFTHQSWQRCRTISLETGAQNIQMCITMLQLSFTAEHLVQMLSFPLAYGLFQ 297 

I I I I I I I I I I I I I: I I I I I I h I : Mil: I I : I I I I : I I 
Db 239 FGLGFLLARFVGQPWHRCRTIALETGMQNAQLASTITQLSFSPAELEVMFAFPLIYSIFQ 298 

Qy 298 LIDGFLIVAAYQTYKR-RLKNKHGKKNSGCTEVCHTRKSTSSRETNAFLEVNEEG 351 

I : : |: : : I I I : : I I I I I : I I I : I I 

Db 299 LWAGIAVSIHYSIKRCRHQTLVEEDGEGTTEDCD— KHSYSLENGGF-SCDENG 350 



RESULT 3 
P70172 

ID P70172 PRELIMINARY; PRT; 348 AA. 

AC P70172; 

DT 01-FEB-1997 (TrEMBLrel. 02, Created) 

DT 01-JAN-1998 (TrEMBLrel. 05, Last sequence update) 

DT 01-JUN-2003 (TrEMBLrel. 24, Last annotation update) 

DE Ileal NA+-dependent bile acid transporter (ISBT) . 

GN SLC10A2. 

OS Mus musculus (Mouse) . 

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi ; 

OC Mammalia; Eutheria; Rodentia; Sciurognathi ; Muridae; Murinae; Mus. 

OX NCBI_TaxID=10090; 

RN [1] 

RP SEQUENCE FROM N.A. 

RC STRAIN=ICR; 

RA Saeki T., Matoba K. ; 

RL Submitted (APR-1997) to the EMBL/ GenBank/DDBJ databases. 

RN [2] 

RP SEQUENCE OF 9-332 FROM N.A. 

RC STRAIN=ICR; 

RA Saeki T. ; 

RL Submitted (AUG-1996) to the EMBL/ GenBank/DDBJ databases. 

DR EMBL; AB002693; BAA19606.1; -. 



DR EMBL; D87059; BAA13237.1; 

DR MGD; MGI:1201406; Slc10a2.

DR GO; GO:0016021; C:integral to membrane; IEA.

DR GO; GO: 0008508; F:bile acid: sodium symporter activity; IEA. 

DR GO; GO: 0015711; P:organic anion transport; IEA. 

DR GO; GO: 0006814; P: sodium ion transport; IEA. 

DR InterPro; IPR002657; BilAc/Na_symport . 

DR InterPro; IPR004710; Bil_ac_transpt . 

DR Pfam; PF01758; SBF; 1. 

DR TIGRFAMs; TIGR00841; bass; 1. 

SQ SEQUENCE 348 AA; 38134 MW; D00B5E43431875D7 CRC64 ; 



Query Match 44.0%; Score 871; DB 11; Length 348; 

Best Local Similarity 47.4%; Pred. No. 1.4e-63; 

Matches 167; Conservative 74; Mismatches 97; Indels 14; Gaps 5;

Qy 7 SSSACPANSS--EEELPVGLEVHGN--LELVFTWSTVMMGLLMFSLGCSVEIRKLWSHI 62

: I I I I I : : i : I I : I I I : I I : : : : : I I i : I I : I I : I II
Db 3 NSSVCPPNATVCEGDSCWPESNFNAI LNTVMSTVLTILLAMVMFSMGCNVEVHKFLGHI 62

Qy 63 RRPWGIAVGLLCQFGLMPFTAYLLAISFSLKPVQAIAVLIMGCCPGGTISNIFTFWVDGD 122

: I I I I I II llllhll I : : I : : : : MM: I I I I I I I I I I I Mi : I : I I I
Db 63 KRPWGIFVGFLCQFGIMPLTGFILSVASGILPVQAVWLIMGCCPGGTGSNILAYWIDGD 122

Qy 123 MDLSISMTTCSTVAALGMMPLCIYLYTWSWSLQQNLTIPYQNIGITLVCLTIPVAFGVYV 182

M I I : I I I M I I : I I I II I II : : : I I I : III : M I : M I II I : M :: I

Db 123 MDLSVSMTTCSTLLALGMMPLCLFVYTKMWVDSGTIVI PYDSIGI SLVALVI PVSFGMFV 182

Qy 183 NYRWPKQSKIILKIGAWGGVLLLWAVAGWLAKGSWNSDITLLTISFIFPLIGHVTGF 242

I : : II : : : I I II I II : : I : I : : : : II I : I : : I : I I I I I : I : II

Db 183 NHKWPQKAKIILKIGSITGVILIVLIAVIGGILYQSAWIIEPKLWIIGTIFPIAGYSLGF 242

Qy 243 LLALFTHQSWQRCRTISLETGAQNIQMCITMLQLSFTAEHLVQMLSFPLAYGLFQLIDGF 302

II I I II I I :: I I I I I I I: I I :: I I M : I I : : II I I : I I I :

Db 243 FLARLAGQPWYRCRTVALETGMQNTQLCSTIVQLSFSPEDLNLVFTFPLIYTVFQLVFAA 302

Qy 303 LIVAAYQTYKRRLKNKHGKKNSGCTEVCHTRKSTSSR ETNAFLEVNEE 350

: I : III:: Ml : : I I II III : : I :

Db 303 VILGIYVTYRK CYGKNDAEFLE--KTDNEMDSRPSFDETNKGFQPDEK 348



RESULT 4 
Q925U7 

ID Q925U7 PRELIMINARY; PRT; 348 AA.

AC Q925U7; 

DT 01-DEC-2001 (TrEMBLrel. 19, Created)

DT 01-DEC-2001 (TrEMBLrel. 19, Last sequence update) 

DT 01-JUN-2003 (TrEMBLrel. 24, Last annotation update) 

DE Ileal sodium-dependent bile acid transporter. 

OS Mus musculus (Mouse) . 

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi ; 

OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.

OX NCBI_TaxID=10090; 

RN [1] 

RP SEQUENCE FROM N.A. 

RC STRAIN=C57BL; 

RA Dawson P.A., Craddock A.L., Tietjen M.E., Haywood J.H.;



RT "Disruption of the Ileal Bile Acid Transporter Gene in Mice."; 

RL Submitted (MAY-2000) to the EMBL/GenBank/DDBJ databases.

DR EMBL; AF271073; AAK55514.1; -. 

DR EMBL; AF266724; AAK55514.1; JOINED. 

DR EMBL; AF266725; AAK55514.1; JOINED. 

DR EMBL; AF266726; AAK55514.1; JOINED. 

DR EMBL; AF266727; AAK55514.1; JOINED. 

DR EMBL; AF266728; AAK55514.1; JOINED. 

DR GO; GO:0016021; C:integral to membrane; IEA.

DR GO; GO: 0008508; F:bile acid:sodium symporter activity; IEA. 

DR GO; GO: 0015711; P:organic anion transport; IEA. 

DR GO; GO: 0006814; P: sodium ion transport; IEA. 

DR InterPro; IPR002657; BilAc/Na_symport . 

DR InterPro; IPR004710; Bil_ac_transpt.

DR Pfam; PF01758; SBF; 1.

DR TIGRFAMs; TIGR00841; bass; 1. 

SQ SEQUENCE 348 AA; 38094 MW; AD33A1BC76A44482 CRC64;



Query Match 43.8%; Score 866; DB 11; Length 348; 

Best Local Similarity 47.4%; Pred. No. 3.6e-63; 

Matches 167; Conservative 73; Mismatches 98; Indels 14; Gaps 5; 

Qy 7 SSSACPANSS — EEELPVGLEVHGN— LELVFTWSTVMMGLLMFSLGCSVEIRKLWSHI 62 

: I I I I I : : I : I I : I I I : I I : : : : I I I : I I : I I : I II 
Db 3 NSSVCPPNATVCEGDSCWPESNFNAVLNTVKSSVLTTL^MVMFSMGCNVEVHKFLGHI 62 

Qy 63 RRPWGIAVGLLCQFGLMPFTAYLLAISFSLKPVQAIAVLIMGCCPGGTISNIFTFWVDGD 122 

: I I I I I II I I I I I : I I I : : I : : : : I I I I : I I I I I I I II I I III : I : I I I 
Db 63 KRPWGIFVGFLCQFGIMPLTGFILSVASGILPVQAVWLIMGCCPGGTGSNILAYWIDGD 122 

Qy 123 MDLS I SMTTCSTVAALGMMPLCI YLYTWSWSLQQNLTI PYQNIGITLVCLTI PVAFGVYV 182 

I I I I : I I I I I I I : I I I I I I I I ::: I I I : III : I I I : I I I I I I : I I : : I 

Db 123 MDLSVSMTTCSTLLALGMMPLCLFVYTKMWVT)SGTIVT PYDSIGI SLVALVI PVS FGMFV 182 

Qy 183 NYRWPKQSKIILKIGAWGGVLLLWAVAGWLAKGSWNSDITLLTISFIFPLIGHVTGF 242

I : : I I : : : I I I I I I I : : I : I : : : : I I I : I : : I : I I I I I : I : II- 
Db 183 NHKWPQKAKIILKIGSITGVILIVLIAVIGGILYQSAWIIEPKLWIIGTIFPIAGYSLGF 242 

Qy 243 LLALFTHQSWQRCRTISLETGAQNIQMCITMLQLSFTAEHLVQMLSFPLAYGLFQLIDGF 302 

II I I I I I I :: I I I I I I I : I I :: I I I I : I I : : I I I I : I I I : 

Db 243 FLARLAGQPWYRCRTVALETGMQNTQLCSTIVQLSFSPEDLNLVFTFPLIYTVFQLVFAA 302

Qy 303 LIVAAYQTYKRRLKNKHGKKNSGCTEVCHTRKSTSSR ETNAFLEVNEE 350 

: I : III:: : I I : : I I II III : : I : 

Db 303 VI LGI YVTYRK CYGKNDAEFLE--KTDNEMDSRPSFDETNKGFQPDEK 348 



RESULT 5 
O97736

ID O97736 PRELIMINARY; PRT; 348 AA.

AC O97736;

DT 01-MAY-1999 (TrEMBLrel. 10, Created) 
DT 01-MAY-1999 (TrEMBLrel. 10, Last sequence update) 
DT 01-JUN-2003 (TrEMBLrel. 24, Last annotation update) 
DE Hepatic sodium-dependent bile acid transporter. 
OS Oryctolagus cuniculus (Rabbit) . 

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi ; 



OC Mammalia; Eutheria; Lagomorpha; Leporidae; Oryctolagus. 

OX NCBI_TaxID=9986;

RN [1] 

RP SEQUENCE FROM N.A.

RA Stengelin S., Becker W., Maier M., Noll R., Kramer W.;

RT "Rabbit cDNA encoding hepatic sodium-dependent bile acid 

RT transporter."; 

RL Submitted (DEC-1998) to the EMBL/ GenBank/DDBJ databases. 

DR EMBL; AJ131361; CAA10360.1; -. 

DR GO; GO: 0016021; C: integral to membrane; IEA. 

DR GO; GO: 0008508; F:bile acid: sodium symporter activity; IEA. 

DR GO; GO: 0015711; P:organic anion transport; IEA. 

DR GO; GO:0006814; P:sodium ion transport; IEA. 

DR InterPro; IPR002657; BilAc/Na_symport . 

DR InterPro; IPR004710; Bil_ac_transpt . 

DR Pfam; PF01758; SBF; 1. 

DR TIGRFAMs; TIGR00841; bass; 1. 

SQ SEQUENCE 348 AA; 37932 MW; 992A08F4AAA4489B CRC64;

Query Match 28.7%; Score 567.5; DB 6; Length 348; 

Best Local Similarity 38.9%; Pred. No. 1.2e-38; 

Matches 112; Conservative 69; Mismatches 100; Indels 7; Gaps 3; 

Qy 31 ELVFTWSTVMMGLLMFS LGCSVEI RKLWSHI RRPWGI AVGLLCQFGLMP FTAYLLAI SF 90

: I : | : : I : : I I I I I : : I I : : I : I I : I : I : I : I : I I I I : : I I
Db 24 DLALSVILVIMLLTIMLSLGCTMEFSKIKAHFLKPKGLAIALVAQYGIMPLTAFVLGKVF 83

Qy 91 SLKPVQAIAVLIMGCCPGGT I SNI FT FWVDGDMDL S I SMTTCSTVAALGMMPLCI YLYT- 149

: : : I : I : I : II I II : I I : I : I I I I : I I I I I I I I I I I II I I I : I : I :
Db 84 RMNNIEALAI LVCGCSPGGNMSNLFSLAVKGDMNLSIVMTTCSTFLALGMMPLLLYIYSR 143

Qy 150

II I I : I I : I I I : : : I : : : I : I

Db 144

Qy 208

Ml : I : I I I I I : I I I : I I : I : I : I I I I : I I I I

Db 202

Qy 266

I : I : I I : I : : I I : : I I I I : I I I : I I I : I : : I : :

Db 262 NVOLCST I LNVTFAPEVI GPLFFFPLLYMI FOLAEGLLI I AVFRCYEK 309



RESULT 6 
O35940

ID O35940 PRELIMINARY; PRT; 317 AA.

AC O35940;

DT 01-JAN-1998 (TrEMBLrel. 05, Created)

DT 01-JAN-1998 (TrEMBLrel. 05, Last sequence update) 

DT 01-JUN-2003 (TrEMBLrel. 24, Last annotation update) 

DE Na/taurocholate cotransporting polypeptide 2. 

GN SLC10A1 OR NTCP.

OS Mus musculus (Mouse) . 

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi ; 

OC Mammalia; Eutheria; Rodentia; Sciurognathi ; Muridae; Murinae; Mus. 

OX NCBI_TaxID=10090;



RN [1] 

RP SEQUENCE FROM N.A. 

RC STRAIN=Balb/C; 

RA Hagenbuch B. ; 

RT "Identification of two forms of the Na/taurocholate cotransporting 

RT polypeptide in mouse liver."; 

RL Submitted (MAR-1997) to the EMBL/GenBank/DDBJ databases.

DR EMBL; U95132; AAB81024.1; -. 

DR MGD; MGI:97379; Slc10a1.

DR GO; GO: 0016021; C: integral to membrane; IEA. 

DR GO; GO: 0008508; F:bile acid: sodium symporter activity; IEA. 

DR GO; GO: 0015711; P:organic anion transport; IEA. 

DR GO; GO: 0006814; P: sodium ion transport; IEA. 

DR InterPro; IPR002657; BilAc/Na_symport . 

DR InterPro; IPR004710; Bil_ac_transpt . 

DR Pfam; PF01758; SBF; 1. 

DR TIGRFAMs; TIGR00841; bass; 1. 

SQ SEQUENCE 317 AA; 34886 MW; DA32C829C8A8E6D0 CRC64; 

Query Match 27.6%; Score 546; DB 11; Length 317; 

Best Local Similarity 37.2%; Pred. No. 6.6e-37; 

Matches 118; Conservative 70; Mismatches 117; Indels 12; Gaps 6; 

Qy 10 ACPANSSEEELPVGLEV1IGNLELVFTWSTVMMGLLMFSLGCSVEIRKLWSHIRRPWGIA 69 

: | I I Ml I : : I : I I : I : I I I I I : : I I : : I : I I : 

Db 7 SAPFNFS LPPGFG-HRATDTALSVILWMLLLIMLSLGCTMEFSKIKAHFWKPKGVI 62 

Qy 70 VGLLCQFGLMPFTAYLLAISFSLKPVQAIAVLIMGCCPGGTISNIFT FWVDGDMDLSISM 129 

: : : I : I : I I : I : I I II : : I : I : I I I I I I I : I I : I I : I I I : I I I I 

Db 63 IAIVAQYGIMPLSAFLLGWFHLTSIEALAILICGCSPGGNLSNLFTLAMKGDMNLSIVM 122 

Qy 130 TTCSTVAALGMMPLCIYLYT WSWSLQQNLTIPYQNIGITLVCLTIPVAFGVYVNYRW 186 

Mil: I I I I I I I : I : I : : I : : | | : | : : I I : I I I I : : : : 

Db 123 TTCSSFTALGMMPLLLYIYSKGIYDGDLKDK — VPYKGIMLSLVMVLIPCAIGIFLKSKR 180 

Qy 187 PKQSKIILKI GAVVGGVLLL VVAVAGVVLAKGS WN SDIT--LLTISFIFPLI GHVT GFLL 244 

I : I I I : : I : I I I : I : I II I : I I : I : : I 

Db 181 PHWPWLKAGMIITFSLSVAVTVLSVINVGNSIMFVMTPHLLATSSLMPFTGFLMGYIL 240 

Qy 245 ALFTHQSWQRCRTISLETGAQNIQMCITMLQLSFTAEHLVQMLSFPLAYGLFQLIDGFLI 304 

: : I I I I : I I I I I : I : I I : I s : I I : : I I I I : I I I : I I 

Db 241 SALFRLNPSCRRTISMETGFQNVQLCSTILNVTFPPEVIGPLFFFPLLYMIFQLAEGLLF 300 

Qy 305 VAAYQT Y KRRL KN KH GK 321 

: : : I : : I : I I 
Db 301 IIIFRCY-LKIKPQKGK 316 



RESULT 7 
Q8WUZ2 

ID Q8WUZ2 PRELIMINARY; PRT; 437 AA. 

AC Q8WUZ2; 

DT 01-MAR-2002 (TrEMBLrel. 20, Created) 

DT 01-MAR-2002 (TrEMBLrel. 20, Last sequence update) 

DT 01-JUN-2003 (TrEMBLrel. 24, Last annotation update) 

DE Hypothetical protein. 

OS Homo sapiens (Human) . 



OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; 

OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo. 

OX NCBI_TaxID=9606; 

RN [1] 

RP SEQUENCE FROM N.A. 

RC TISSUE=Brain; 

RA Strausberg R. ; 

RL Submitted (DEC-2001) to the EMBL/GenBank/DDBJ databases.

DR EMBL; BC019066; AAH19066.1; -. 

DR GO; GO: 0016020; C:membrane; IEA. 

DR GO; GO: 0008508; F:bile acid: sodium symporter activity; IEA. 

DR GO; GO: 0006814; P: sodium ion transport; IEA. 

DR InterPro; IPR002657; BilAc/Na_symport . 

DR Pfam; PF01758; SBF; 1. 

KW Hypothetical protein. 

SQ SEQUENCE 437 AA; 46503 MW; 055E989629CC13D1 CRC64;



Query Match 27.0%; Score 535; DB 4; Length 437; 

Best Local Similarity 35.7%; Pred. No. 7.3e-36; 

Matches 111; Conservative 59; Mismatches 101; Indels 40; Gaps 5; 

Qy 27 HGNLELVFTWSTVMMGLLMFSLGCSVEIRKLWSHIRRPWGIAVGLLCQFGLMPFTAYLL 86 

II I : 1:1 I I I : I : : : I : I I I I : I I I I I I : I I : I I 

Db 103 HGLNVFVGAALCITMLG LGCTVDVNHFGAHVRRPVGALLAALCQFGLLPLLAFLL 157 

Qy 87 AISFSLKPVQAIAVLIMGCCPGGTISNIFTFWVDGDMDLSISMTTCSTVAALGMMPLCIY 146 

I : : I I I I : I I I : 1 I I I I I : I I : : I i I I I : I I I I I I I : I I : I I I I : : 
Db 158 ALAFKLDEVAAVAVLLCGCCPGGNLSNLMSLLVDGDMNLSIIMTISSTLLALVLMPLCLW 217 

Qy 147 LYTWSW SLQQNLTI PYQNI GITLVCLTI PVAFGVYVNYRWPKQSKI ILKI 196 

: I : I : I : I : I : : I I II: I I : : I : : : : I : I : 

Db 218 IYSWAWINTPIVQ— LLPLGTVTLTLCSTLIPIGLGVFIRYKYSRVADYIVKVSLWSLLV 275 

Qy 197 GAWGGVLLLWAVAGWLAKGSWNSDITLLTISFIFPLIGHVTGFLLALF 247 

I : : I I I : I I : I I I I : : I : I I 

Db 276 TLWLFIMTGTMLGPELLASIPAAVYVIA IFMPLAGYASGYGLATL 321 

Qy 248 THQSWQRCRTISLETGAQNIQMCITMLQLSFTAEHLVQMLSFPLAYGLFQLIDGFLIVAA 307 

I I I : I I I I : I I : I : I : I : I : I : : I I I I I I I I : : I 

Db 322 FHLPPNCKRTVCLETGSQNVQLCTAILKLAFPPQFIGSMYMFPLLYALFQSAEAGIFVLI 381 

Qy 308 YQTYKRRLKNK 318 

I : I : : I 
Db 382 YKMYGS EMLHK 392 



RESULT 8 
Q96EP9 

ID Q96EP9 PRELIMINARY; PRT; 462 AA. 

AC Q96EP9; 

DT 01-DEC-2001 (TrEMBLrel. 19, Created) 

DT 01-DEC-2001 (TrEMBLrel. 19, Last sequence update) 

DT 01-OCT-2003 (TrEMBLrel. 25, Last annotation update) 

DE Hypothetical protein (Fragment) . 

OS Homo sapiens (Human) . 

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; 
OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo. 



OX NCBI_TaxID=9606; 

RN [1] 

RP SEQUENCE FROM N.A. 

RC TISSUE=Brain; 

RA Strausberg R. ; 

RL Submitted (AUG-2001) to the EMBL/GenBank/DDBJ databases.

DR EMBL; BC012048; AAH12048.1; 

DR Genew; HGNC: 22980; SLC10A4 . 

DR GO; GO: 0016020; C : membrane; IEA. 

DR GO; GO:0008508; F:bile acid:sodium symporter activity; IEA. 

DR GO; GO: 0006814; P: sodium ion transport; IEA. 

DR InterPro; IPR002657; BilAc/Na_symport . 

DR Pfam; PF01758; SBF; 1. 

KW Hypothetical protein. 

FT NON_TER 1 1 

SQ SEQUENCE 462 AA; 49035 MW; B916D68AEE40622C CRC64; 

Query Match 27.0%; Score 535; DB 4; Length 462; 

Best Local Similarity 35.7%; Pred. No. 7.7e-36; 

Matches 111; Conservative 59; Mismatches 101; Indels 40; Gaps 5; 

Qy 27 HGNLELVFTWSTVMMGLLMFSLGCSVEIRKLWSHIRRPWGIAVGLLCQFGLMPFTAYLL 86 

|| | : | : | I I I : I : : : I : I I I I : I I I I I I : I I : I I 

Db 128 HGLNVFVGAALCITMLG LGCTVDVNHFGAHVRRPVGALLAALCQFGLLPLLAFLL 182 

Qy 87 AI S FSLKPVQAI AVLIMGCCPGGTI SNI FTFWVDGDMDLS I SMTTCST VAALGMMPLCI Y 146 

I : : I I | I : I I I : I I I I I I : I I : : I I I I I : I I I I I I I : I I : I I I I : : 
Db 183 ALAFKLDEVAAVAVLLCGCCPGGNLSNLMSLLVDGDMNLSIIMTISSTLLALVLMPLCLW 242 

Qy 147 LYTWSW SLQQNLTIPYQNIGITLVCLTIPVAFGVYVNYRWPKQSKIILKI 196 

: | : | : | : | : I : : I I I I : | | : : | : : : : I : I : 

Db 243 IYSWAWINTPIVQ— LLPLGTVTLTLCSTLI PIGLGVFIRYKYSRVADYIVKVSLWSLLV 300 

Qy 197 GAVVGGVIjL LVVAVAGVVLAKG S WN SDITLLTISFIFPLI GHVT G F LLAL F 247 

I : : I II : I I : I I I I : : I : I I 

Db 301 TLVVLFIMTGTMLGPELLASI PAAVYVIA 1 FMP LAG YAS GYGLAT L 346 

Qy 248 T HQ S WQ RC RT I S L ET GAQN I QMC I TMLQ L S FT AEH L VQML S F P LAY GL FQLIDGFLI VAA 307 

I I I : I I I I : I I : I : I : I : I : I : : I I I I I I I I : : I 

Db 347 FHLPPNCKRTVCLETGSQNVQLCTAILKLAFPPQFIGSMYMFPLLYALFQSAEAGIFVLI 406 

Qy 308 YQTYKRRLKNK 318 

I : I : : I 

Db 407 YKMYGSEMLHK 417 



RESULT 9 
Q8BJC7 

ID Q8BJC7 PRELIMINARY; PRT; 437 AA. 

AC Q8BJC7; 

DT 01-MAR-2003 (TrEMBLrel. 23, Created) 

DT 01-MAR-2003 (TrEMBLrel. 23, Last sequence update) 

DT 01-JUN-2003 (TrEMBLrel. 24, Last annotation update) 

DE Hypothetical sodium bile acid symporter containing protein. 

OS Mus musculus (Mouse) . 

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi ; 
OC Mammalia; Eutheria; Rodentia; Sciurognathi ; Muridae; Murinae; Mus. 



OX NCBI_TaxID=10090; 

RN [1] 

RP SEQUENCE FROM N.A. 

RC STRAIN=C57BL/6J; TISSUE=Eye; 

RX MEDLINE=22354683; PubMed=12466851; 

RA The FANTOM Consortium, 

RA the RIKEN Genome Exploration Research Group Phase I & II Team; 

RT "Analysis of the mouse transcriptome based on functional annotation of 

RT 60,770 full-length cDNAs.";

RL Nature 420:563-573(2002). 

DR EMBL; AK087479; BAC39890.1; 

DR GO; GO: 0016020; C:membrane; IEA. 

DR GO; GO: 0008508; F:bile acid: sodium symporter activity; IEA. 

DR GO; GO: 0006814; P: sodium ion transport; IEA. 

DR InterPro; IPR002657; BilAc/Na_symport.

DR Pfam; PF01758; SBF; 1. 

KW Hypothetical protein. 

SQ SEQUENCE 437 AA; 46599 MW; EF84A0A79C4CE503 CRC64; 

Query Match 26.1%; Score 516; DB 11; Length 437; 

Best Local Similarity 34.2%; Pred. No. 2.7e-34; 

Matches 122; Conservative 63; Mismatches 132; Indels 40; Gaps 8; 

Qy 8 SSACPANSSEEELPV GLEVHGNLELVFTWSTVMMGLLMFSLGCSVEIRKLWS 60 

| I I I II III II I I I I : I : : : 

Db 82 S S AF P RPWI PH EP P FWDT P LNHGLN VFVGAALC I T MLGLGCTVDVNHFGA 131

Qy 61 HIRRPWGIAVGLLCQFGLMPFTAYLLAISFSLKPVQAIAVLIMGCCPGGTISNIFTFWVD 120

| : | | | | : I I I I I : I I : I I I : I I I I : I I I : I I I I I I : I I : : I I 

Db 132 HVRRPVGALLAALCQFGFLPLLAFLLALIFKLDEVAAVAVLLCGCCPGGNLSNLMSLLVD 191 

Qy 121 GDMDLSISMTTCSTVAALGMMPLCIYLYTWSW— SLQQNLTIPYQNIGITLVCLTIPVA 177 

111:11111 | | : | | : I I I I : : : I : : I II : I : : I I I I : 

Db 192 GDMNLSIIMTISSTLLALVLMPLCLWIYSRAWINTPLVQ--LLPLGAVTLTLCSTLIPIG 249 

Qy 178 FGVYVN YRW PKQSKIILKI GAWGGVL L LWAV- AGWLAKGSWN S-DITLLTISFIFPL 235 

I I : : I : : : : I : I : I I : I : : I : I I I = • • II 

Db 250 L G VF I R Y K YN RVAD Y I VKVS LW S L L VT LWL F I MT GTMLGPEL LAS I P AT VYVVAI FM P L 309 

Qy 236 IGHVTGFLLALFTHQSWQRCRTISLETGAQNIQMCITMLQLSFTAEHLVQMLSFPLAYGL 295 

| : : I : I I I I I : I I I I : I I : I : I : I : I : I = I I I I I I 

Db 310 AGYASGYGLATLFHLPPNCKRTVCLETGSQNVQLCTAILKLAFPPRFIGSMYMFPLLYAL 369 

Qy 296 FQLI DGFLI VAAYQT YKRRLKNKHGKKNSGCTEVCHTRKST S S RETN — AFLEVNEE 350 

| | : : I I : I : | : | I : : I : : : : I I 

Db 370 FQSAEAGVFVLIYKMYG SEILHKREALDEDEDTDISYKKLKEE 412 



RESULT 10 
Q7YS68 

ID Q7YS68 PRELIMINARY; PRT; 143 AA. 

AC Q7YS68; 

DT 01-OCT-2003 (TrEMBLrel. 25, Created) 
DT 01-OCT-2003 (TrEMBLrel. 25, Last sequence update) 
DT 01-OCT-2003 (TrEMBLrel. 25, Last annotation update) 
DE Truncated sodium-dependent bile acid transporter. 
OS Oryctolagus cuniculus (Rabbit) . 



OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; 

OC Mammalia; Eutheria; Lagomorpha; Leporidae; Oryctolagus . 

OX NCBI_TaxID=9986; 

RN [1] 

RP SEQUENCE FROM N.A. 

RC TISSUE=Ileum;

RA Weihrauch D., Ao M. , Rao M.C.; 

RT "Expression of bile acid transporter and bile acid receptors in the 

RT developing rabbit colon."; 

RL Submitted (MAY-2003) to the EMBL/GenBank/DDBJ databases.

DR EMBL; AY292653; AAP49247.1; 

SQ SEQUENCE 143 AA; 15339 MW; 037ED60C2E3CAF5F CRC64; 

Query Match 18.3%; Score 363; DB 6; Length 143; 

Best Local Similarity 52.7%; Pred. No. 3.2e-22; 

Matches 68; Conservative 24; Mismatches 31; Indels 6; Gaps 3; 

Qy 11 CPANSS--EEELPVGLEVHGN--LELVFTWSTVMMGLLMFSLGCSVEIRKLWSHIRRPW 66
|||:: | | | : I I : I : I I : : : I : I I I : I I : I I I : I I I I I I I

Db 8

Qy 67

|| : I I I I I I : I I I : : I I : : I : I : I I : I I I I I I I I I I I III : I I I I I I I I

Db 68

Qy 127

: I I

Db 128



RESULT 11 
Q8VI83 

ID Q8VI83 PRELIMINARY; PRT; 125 AA.
AC Q8VI83;
DT 01-MAR-2002 (TrEMBLrel. 20, Created)
DT 01-MAR-2002 (TrEMBLrel. 20, Last sequence update)
DT 01-JUN-2003 (TrEMBLrel. 24, Last annotation update)
DE Ileal sodium-dependent bile acid transporter (Fragment).
GN ISBT.
OS Mus musculus (Mouse).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
OX NCBI_TaxID=10090;
RN [1]
RP SEQUENCE FROM N.A.
RA Saeki T., Kirifuji K., Kanamoto R., Iwami K.;
RT "Identification of transcription start sites in mouse ileal sodium-
RT dependent bile acid transporter gene.";
RL Submitted (JAN-2002) to the EMBL/GenBank/DDBJ databases.
DR EMBL; AB078635; BAB84081.1; -.
DR GO; GO:0016020; C:membrane; IEA.
DR GO; GO:0008508; F:bile acid:sodium symporter activity; IEA.
DR GO; GO:0006814; P:sodium ion transport; IEA.
DR InterPro; IPR002657; BilAc/Na_symport.
DR Pfam; PF01758; SBF; 1.
FT NON_TER 125 125
SQ SEQUENCE 125 AA; 13275 MW; C7F8EFC459D4C8F7 CRC64;



Query Match 17.8%; Score 352; DB 11; Length 125; 

Best Local Similarity 52.0%; Pred. No. 2.2e-21; 

Matches 64; Conservative 25; Mismatches 30; Indels 4; Gaps 

Qy 7 SSSACPANSS— EEELPVGLEVHGN— LELVFTWSTVMMGLLMFSLGCSVEIRKLWSHI 6 

: I I I I I : : | : I I : I I | : | | : : : : : I I I : I I : I I : I II 
Db 3 NSSVCPPNATVCEGDSCWPESNFNAI LNTVMSTVLTILLAMVMFSMGCNVEVHKFLGHI 6 

Qy 63 RRPWGIAVGLLCQFGLMPFTAYLLAISFSLKPVQAIAVLIMGCCPGGTISNIFTFWVDGD 1 

: | | | | | || 11111:11 I : : I : : : : MM: I I II I I II II I III : I : I I I 

Db 63 KRPWGIFVGFLCQFGIMPLTGFILSVASGILPVQAVWLIMGCCPGGTGSNILAYWIDGD 1

Qy 123 MDL 125 

I I I 

Db 123 MDL 125 



RESULT 12
Q8QZR2

ID Q8QZR2 PRELIMINARY; PRT; 473 AA.
AC Q8QZR2;
DT 01-JUN-2002 (TrEMBLrel. 21, Created)
DT 01-JUN-2002 (TrEMBLrel. 21, Last sequence update)
DT 01-OCT-2003 (TrEMBLrel. 25, Last annotation update)
DE Similar to protein P3 (Hypothetical protein).
OS Mus musculus (Mouse).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
OX NCBI_TaxID=10090;
RN [1]
RP SEQUENCE FROM N.A.
RA Strausberg R.;
RL Submitted (APR-2002) to the EMBL/GenBank/DDBJ databases.
RN [2]
RP SEQUENCE FROM N.A.
RC STRAIN=C57BL/6J; TISSUE=Thymus;
RX MEDLINE=22354683; PubMed=12466851;
RA The FANTOM Consortium,
RA the RIKEN Genome Exploration Research Group Phase I & II Team;
RT "Analysis of the mouse transcriptome based on functional annotation of
RT 60,770 full-length cDNAs.";
RL Nature 420:563-573(2002).
DR EMBL; BC023050; AAH23050.1; -.
DR EMBL; BC027440; AAH27440.1; -.
DR EMBL; AK041958; BAC31110.1; -.
DR GO; GO:0016021; C:integral to membrane; IEA.
DR GO; GO:0008508; F:bile acid:sodium symporter activity; IEA.
DR GO; GO:0015711; P:organic anion transport; IEA.
DR GO; GO:0006814; P:sodium ion transport; IEA.
DR InterPro; IPR002657; BilAc/Na_symport.
DR InterPro; IPR004710; Bil_ac_transpt.
DR Pfam; PF01758; SBF; 1.
DR TIGRFAMs; TIGR00841; bass; 1.
KW Hypothetical protein.
SQ SEQUENCE 473 AA; 50254 MW; 9A2AD0A005DD1805 CRC64;

Query Match 17.8%; Score 351.5; DB 11; Length 473;

Best Local Similarity 32.4%; Pred. No. 9.5e-21;

Matches 91; Conservative 57; Mismatches 108; Indels 25; Gaps 5;



Qy 11 CPANSSEEELPVGLEVH-GNLE LVFT WSTVMMGLLMFS LGCS VE I RKLWSH I RRPW 66 

I I I : I I I : I : ::::::: I I I I I : I : : I 

Db 163 CIRVS PAEDLPSALNTNLGHFSENPILYLLLPLIFVNKCSF — GCKVELEVLKELLQSPQ 220 

Qy 67 GIAVGLLCQFGLMPFTAYLLAISFSLKPVQAIAVLIMGCCPGGTISNIFTFWVDGDMDLS 126 

: : I I I I I : I I I I : I : I II I : : : I III I : I : : I I : I : 

Db 221 PMLLGLLGQFLVMPFYAFLMAKVFMLPKALALGLIITCSSPGGGGSYLFSLLLGGDVTLA 280 

Qy 127 ISMTTCSTVAALGMMPLCIYLYTWSWSLQQNLTIPYQNIGITLVCLTIPVAFGVYVNYRW 186 

I I I I I I I I I I : I I : I : : I : : I : I I I I : : I I : I I I : : 
Db 281 ISMTFISTVAATGFLPLSSAI YSYLLSIHETLHVPISKILGTLLFIAIPIAAGWIKSKL 340 

Qy 187 PKQSKIILKIGAWGGVLLL WAVAGWLAKGSWNSDITLLTISFI FPLI 236 

I I I : : : I : : : I I I I : I I I : : : | | | : 

Db 341 PKFSELLLQVIKPFSFILLLGGLFLAYHMGVFILVGVRL PIVLVGFTVPLV 391 

Qy 237 GH VT GFL LAL FT HQ S WQ RC RT I S L ET GAQN I QMC I TMLQL S 277 

I : I : I I : : I I : i : I I I I : : I I I I I 

Db 392 GLLVGYSLAICLKLPVAQRRTVSIEVGVQNSLLALAMLQLS 432 



RESULT 13 
Q9QZJ2 

ID Q9QZJ2 PRELIMINARY; PRT; 187 AA. 

AC Q9QZJ2; 

DT 01-MAY-2000 (TrEMBLrel. 13, Created) 

DT 01-MAY-2000 (TrEMBLrel. 13, Last sequence update) 

DT 01-JUN-2003 (TrEMBLrel. 24, Last annotation update) 

DE Na-taurocholate cotransporting polypeptide (Fragment) . 

OS Mesocricetus auratus (Golden hamster) . 

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi ; 

OC Mammalia; Eutheria; Rodentia; Sciurognathi ; Muridae; Cricetinae; 

OC Mesocricetus. 

OX NCBI_TaxID=10036; 

RN [1] 

RP SEQUENCE FROM N.A. 

RA Balasubramanian N., Arrese M. , Suchy F.J., Ananthanarayanan M. ; 

RT "Na-Taurocholate cotransporting polypeptide (Ntcp) from Hamster 

RT liver."; 

RL Submitted (AUG-1999) to the EMBL/ GenBank/DDB J databases. 

DR EMBL; AF181258; AAD53961.1; -. 

DR GO; GO: 0016020; C:membrane; IEA. 

DR GO; GO: 0008508; F:bile acid: sodium symporter activity; IEA. 

DR GO; GO:0006814; P:sodium ion transport; IEA. 

DR InterPro; IPR002657; BilAc/Na_symport . 

DR Pfam; PF01758; SBF; 1. 

FT NON_TER 1 1

FT NON_TER 187 187 

SQ SEQUENCE 187 AA; 20182 MW; 2855C5F44AB482C6 CRC64;



Query Match 17.7%; Score 351; DB 11; Length 187; 

Best Local Similarity 42.6%; Pred. No. 4.1e-21; 

Matches 80; Conservative 34; Mismatches 64; Indels 10; Gaps 5;



Qy 75 QFGLMPFTAYLLAISFSLKPVQAIAVLIMGCCPGGTISNIFTFWVDGDMDLSISMTTCST 134 

111:11 I : : I I I I I : : I : I : I I II III : I I : I I : I I I : I I I I I I I I I 

Db 1 QFGIMPL7VAFVLGKVFHLKPIEALAILICGCSPGGNLSNLFTLAMKGDMNLSIVMTTCST 60 

Qy 135 VAALGMMPLCIYLYT WSWSLQQNLTIPYQNIGITLVCLTIPVAFGVYVNYRWPKQSK 191 

I I I I I I I I : I : I I : I : : I I | I : | I : I I : I : : : : I : 
Db 61 FAALGMMPLLLYIYTKGIYDGDLKDK — VPYGGIMISLVMVLIPCSLGI FLKTKRPQYVP 118 

Qy 192 IILKIGAWGGVLLLWAVAGWLAKGSWNSDIT — LLTISFIFPLIGHVTGFLL-ALFT 248 

I : I I : : I : I I : : I : I II I : I I : I : I I I I 

Db 119 YIIKGGMTITFLLSVAVTVLSIINVGNSIKFAMTPPLLATSSLMPFSGFLLGYALSALF- 177 

Qy 249 HQSWQRCR 256

I I I I 

Db 178 -QLNPRCR 184 



RESULT 14 
Q9BSL2 

ID Q9BSL2 PRELIMINARY; PRT; 448 AA.

AC Q9BSL2; 

DT 01-JUN-2001 (TrEMBLrel. 17, Created) 

DT 01-JUN-2001 (TrEMBLrel. 17, Last sequence update)

DT 01-JUN-2003 (TrEMBLrel. 24, Last annotation update) 

DE Similar to protein P3. 

OS Homo sapiens (Human) . 

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi ; 

OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo. 

OX NCBI_TaxID=9606; 

RN [1] 

RP SEQUENCE FROM N.A. 

RC TISSUE=Brain; 

RA Strausberg R. ; 

RL Submitted (MAR-2001) to the EMBL/GenBank/DDBJ databases.

DR EMBL; BC004966; AAH04966.1; -. 

DR GO; GO: 0016021; C: integral to membrane; IEA. 

DR GO; GO:0008508; F:bile acid:sodium symporter activity; IEA. 

DR GO; GO: 0015711; P: organic anion transport; IEA. 

DR GO; GO: 0006814; P: sodium ion transport; IEA. 

DR InterPro; IPR002657; BilAc/Na_symport . 

DR InterPro; IPR004710; Bil_ac_transpt . 

DR Pfam; PF01758; SBF; 1. 

DR TIGRFAMs; TIGR00841; bass; 1. 

SQ SEQUENCE 448 AA; 47548 MW; 47A1263CF8EFFF91 CRC64;

Query Match 16.9%; Score 333.5; DB 4; Length 448; 

Best Local Similarity 31.5%; Pred. No. 2.7e-19; 

Matches 87; Conservative 53; Mismatches 115; Indels 21; Gaps 3; 

Qy 12 PANSSEEELPVGLEVliGNLELVFTWSTWIMGLLMFSLGCSVEIRKLWSHIRRPWGIAVG 71 

||: | I ::::::: I I I I I : I : : I : : I 

Db 143 PAEDTPATLSADLAHFSENPILYLLLPLIFVNKCSF— GCKVELEVLKGLMQSPQPMLLG 200 



Qy 72 LLCQFGLMPFTAYLLAI SFSLKPVQAIAVLIMGCCPGGTI SNI FTFWVDGDMDLS I SMTT 131

I I I I : I I I : I : I II I : : : I III I : I : : I I : I : I I I I

Db 201 LLGQFLVTVIPLYAFLMAKVFMLPKALALGLIITCSSPGGGGSYLFSLLLGGDVTLAISMTF 260



Qy 132 CSTVAALGMMPLCIYLYTWSWSLQQNLTIPYQNIGITLVCLTIPVAFGVYVNYRWPKQSK 191 

||||||:|| : i : I : : I : I I I I : : I I : I I I : : I I I : 
Db 261 LSTVAATGFLPLSSAIYSRLLSIHETLHVPISKILGTLLFIAIPIAVGVLIKSKLPKFSQ 320 

Qy 192 1 1 LKI GAWGGVLLL WAVAGWLAKG S WN SDITLLTISFIFPLI GHVT G 241 

: : I : : I I I I I : I I : I : : : I I : I : I 

Db 321 LLLQWKP FS FVL L L GG LF LAY RMGVF I LAGI RL PIVLVGITVPLVGLLVG 371 

Qy 242 FLLALFTHQSWQRCRTISLETGAQNIQMCITMLQLS 277 

: M : | | : | : | I I I : : I I I I I 

Db 372 YC LAT C L K L P VAQ RRT VS I E VG VQN S L LALAMLQ L S 407 



RESULT 15 
O34524

ID O34524 PRELIMINARY; PRT; 321 AA.

AC O34524;

DT 01-JAN-1998 (TrEMBLrel. 05, Created) 

DT 01-JAN-1998 (TrEMBLrel. 05, Last sequence update) 

DT 01-JUN-2003 (TrEMBLrel. 24, Last annotation update) 

DE Putative transporter. 

GN YOCS . 

OS Bacillus subtilis . 

OC Bacteria; Firmicutes; Bacillales; Bacillaceae; Bacillus. 

OX NCBI_TaxID=1423; 

RN [1] 

RP SEQUENCE FROM N.A. 

RA Lapidus A., Galleron N., Sorokin A., Ehrlich D.; 

RL Submitted (NOV-1997) to the EMBL/ GenBank/DDBJ databases. 

RN [2] 

RP SEQUENCE FROM N.A. 

RC STRAIN=168;

RX MEDLINE=98044033; PubMed=9384377;

RA Kunst F., Ogasawara N., Moszer I., Albertini A.M., Alloni G.,

RA Azevedo V., Bertero M.G., Bessieres P., Bolotin A., Borchert S.,

RA Borriss R., Boursier L., Brans A., Braun M., Brignell S.C., Bron S.,

RA Brouillet S., Bruschi C.V., Caldwell B., Capuano V., Carter N.M.,

RA Choi S.K., Codani J.J., Connerton I.F., Cummings N.J., Daniel R.A.,

RA Denizot F., Devine K.M., Dusterhoft A., Ehrlich S.D., Emmerson P.T.,

RA Entian K.D., Errington J., Fabret C., Ferrari E., Foulger D.,

RA Fritz C., Fujita M., Fujita Y., Fuma S., Galizzi A., Galleron N.,

RA Ghim S.Y., Glaser P., Goffeau A., Golightly E.J., Grandi G.,

RA Guiseppi G., Guy B.J., Haga K., Haiech J., Harwood C.R., Henaut A.,

RA Hilbert H., Holsappel S., Hosono S., Hullo M.F., Itaya M., Jones L.,

RA Joris B., Karamata D., Kasahara Y., Klaerr-Blanchard M., Klein C.,

RA Kobayashi Y., Koetter P., Koningstein G., Krogh S., Kumano M.,

RA Kurita K., Lapidus A., Lardinois S., Lauber J., Lazarevic V.,

RA Lee S.M., Levine A., Liu H., Masuda S., Mauel C., Medigue C.,

RA Medina N., Mellado R.P., Mizuno M., Moestl D., Nakai S., Noback M.,

RA Noone D., O'Reilly M., Ogawa K., Ogiwara A., Oudega B., Park S.H.,

RA Parro V., Pohl T.M., Portetelle D., Porwollik S., Prescott A.M.,

RA Presecan E., Pujic P., Purnelle B., Rapoport G., Rey M., Reynolds S.,

RA Rieger M., Rivolta C., Rocha E., Roche B., Rose M., Sadaie Y.,

RA Sato T., Scanlan E., Schleich S., Schroeter R., Scoffone F.,

RA Sekiguchi J., Sekowska A., Seror S.J., Serror P., Shin B.S., Soldo B.,

RA Sorokin A., Tacconi E., Takagi T., Takahashi H., Takemaru K.,

RA Takeuchi M., Tamakoshi A., Tanaka T., Terpstra P., Tognoni A.,

RA Tosato V., Uchiyama S., Vandenbol M., Vannier F., Vassarotti A.,

RA Viari A., Wambutt R., Wedler E., Wedler H., Weitzenegger T.,

RA Winters P., Wipat A., Yamamoto H., Yamane K., Yasumoto K., Yata K.,

RA Yoshida K., Yoshikawa H.F., Zumstein E., Yoshikawa H., Danchin A.;

RT "The complete genome sequence of the Gram-positive bacterium Bacillus 

RT subtilis."; 

RL Nature 390:249-256(1997). 

RN [3] 

RP SEQUENCE FROM N.A. 

RC STRAIN=168;

RA Kunst F. , Ogasawara N., Yoshikawa H., Danchin A.; 

RL Submitted (NOV-1997) to the EMBL/ GenBank/DDBJ databases. 

DR EMBL; AF027868; AAB84443.1; 

DR EMBL; Z99114; CAB13827.1; -. 

DR PIR; E69902; E69902. 

DR GO; GO: 0016021; C:integral to membrane; IEA. 

DR GO; GO: 0008508; F:bile acid: sodium symporter activity; IEA. 

DR GO; GO: 0015711; P:organic anion transport; IEA. 

DR GO; GO: 0006814; P:sodium ion transport; IEA. 

DR InterPro; IPR002657; BilAc/Na_symport . 

DR InterPro; IPR004710; Bil_ac_transpt . 

DR Pfam; PF01758; SBF; 1. 

DR TIGRFAMs; TIGR00841; bass; 1. 

KW Complete proteome. 

SQ SEQUENCE 321 AA; 34251 MW; 0D9CCF6B36E84A96 CRC64 ; 

Query Match 16.4%; Score 325; DB 16; Length 321; 

Best Local Similarity 27.6%; Pred. No. 9.7e-19; 

Matches 84; Conservative 76; Mismatches 114; Indels 30; Gaps 12; 

Qy 33 VFTWS TVMMGLLMFS LGCSVEI RKLWSHI RRPWGIAVGLLCQFGLMPFTAYLLAI S 89 

: I I : I | : : | : : | | : | : : : : | : | | : : | : : I : : I I I : I I 

Db 32 LFTWI S S YIT I FLGI IMFGMGLTLQADDFKELVRKPWQVI I GVIAQYTIMPLVAFGLAFG 91 

Qy 90 FSLKPVQAIAVLIMGCCPGGTISNIFTFWVDGDMDLSISMTTCSTVAALGMMPLCIYLYT 149 

I I : I : : : I I I I I I I I I : I I I : I I : : : I I I I : I : I I I I : 

Db 92 LHLPAEIAVGVILVGCCPGGTASNViyiTFIAKGNTALSVAvTTISTLLAPVWPLLIMLFA 151 

Qy 150 WSWS LQQNLT I P YQN I GI T LV- CLT I PVAFGVYVN YRWPKQ- SKI I — LKI GAWGGVLL 205 

I I : : : | : : : : | : | : | : I I : I : I : : I : I 
Db 152 KEW LPVSPGSLFISILQAVLFPIIAGLIVKMFFRKQVAKAVHALPLVSVIG 202 

Qy 206 LWAVAGWLAKGSWN SDITLLTI SFI FPLIGHVTGFLLALFTHQSWQRCRTI SLET 262 

: I I : I : : I I : : : : : I I :: I I I I : = I * • I 

Db 203 - 1 VAI VSAWSGNRENLLQS GLLI FSWI LHNGI GYLLGFLCAKLLKMD YPSQKAI AI EV 261 

Qy 263 GAQNIQMCITMLQLSFTAEHLVQMLSFPLA-YGLFQLIDGFLIVAAYQTYKRRLKNKH-G 32 0 

III : I : I : : | | : : : : | : : | | : : : : | | I 

Db 262 GMQN SGLGAALATAHFSPLSAVPSAIFSVWHNLSGSML-ATY — WSKKVKKKQAG 313 

Qy 321 KKNS 324 

I : I 

Db 314 SKSS 317 
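
The Best Local Similarity figure reported for each hit appears to be the share of exact matches among all aligned columns. The sketch below reproduces the value for the hit above under that assumption; the function name and the column-counting rule are illustrative guesses inferred from the reported numbers, not documented GenCore behaviour.

```python
def best_local_similarity(matches, conservative, mismatches, indels):
    """Return the percentage of aligned columns that are exact matches."""
    # Assumption: every aligned column is counted exactly once, as a match,
    # a conservative substitution, a mismatch, or an indel position.
    columns = matches + conservative + mismatches + indels
    if columns == 0:
        raise ValueError("alignment has no columns")
    return 100.0 * matches / columns


# Counts reported for the 321 AA Bacillus subtilis hit above.
similarity = best_local_similarity(84, 76, 114, 30)
print(f"{similarity:.1f}%")  # 27.6%
```

The same arithmetic gives 47.0% for the counts reported for NTCI_RABIT (164, 73, 98, 14), which agrees with that entry.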



Search completed: March 23, 2004, 14:37:17 



Job time : 48 secs



GenCore version 5.1.6 
Copyright (c) 1993 - 2004 Compugen Ltd. 



OM protein - protein search, using sw model 
Sequence :
(without alignments)
Scoring table: BLOSUM62

Gapop 10.0 , Gapext 0.5 

Searched: 141681 seqs, 52070155 residues 
Minimum DB seq length: 0
Post-processing: Minimum Match 0%

Maximum Match 100% 
141681



Database : SwissProt 42:*



Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 
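
Alongside Pred. No., each hit carries a Query Match percentage. The values in this report are consistent with the hit score divided by the score of the query aligned to itself (1979, taken from the 100.0% self-hit earlier in this report). The sketch below checks that reading; the function name is hypothetical and the formula is an inference from the reported figures, not a documented definition.

```python
def query_match_percent(score, perfect_score):
    """Return a hit score as a percentage of the query's self-alignment score."""
    if perfect_score <= 0:
        raise ValueError("perfect_score must be positive")
    return 100.0 * score / perfect_score


# Assumption: 1979 is the perfect score, read from the 100.0% self-hit.
PERFECT_SCORE = 1979

for score in (886, 559.5, 325):
    print(score, f"{query_match_percent(score, PERFECT_SCORE):.1f}%")
```

Under this assumption the three scores give 44.8%, 28.3% and 16.4%, matching the values printed for NTCI_RABIT, NTCP_RAT and the Bacillus subtilis hit.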



SUMMARIES 



                  %
Result          Query
   No.   Score  Match  Length  DB  ID           Description

     1     886   44.8     347   1  NTCI_RABIT   Q28727 o ileal sod
     2     884   44.7     348   1  NTCI_CRIGR   Q60414 c ileal sod
     3     871   44.0     348   1  NTCI_RAT     Q62633 r ileal sod
     4   860.5   43.5     348   1  NTCI_HUMAN   Q12908 h ileal sod
     5   559.5   28.3     362   1  NTCP_RAT     P26435 rattus norv
     6     553   27.9     349   1  NTCP_HUMAN   Q14973 homo sapien
     7     553   27.9     362   1  NTCP_MOUSE   O08705 mus musculu
     8   333.5   16.9     477   1  P3_HUMAN     P09131 homo sapien
     9   180.5    9.1     182   1  P3_MOUSE     P21129 mus musculu
    10     125    6.3     409   1  YCXA_BACSU   Q08791 bacillus su
    11     117    5.9     721   1  YJIY_ECOLI   P39396 escherichia
    12     116    5.9     286   1  YCXE_BACME   P40419 bacillus me
    13   112.5    5.7     368   1  CYB_TOXGO    O20672 toxoplasma
    14   109.5    5.5     383   1  Y944_SYNY3   P74311 synechocyst
    15   107.5    5.4     576
ALIGNMENTS



RESULT 1 
NTCI_RABIT 

ID NTCI_RABIT STANDARD; PRT; 347 AA. 

AC Q28727; 

DT 15-JUL-1998 (Rel. 36, Created) 

DT 15-JUL-1998 (Rel. 36, Last sequence update) 

DT 15-JUL-1998 (Rel. 36, Last annotation update) 

DE Ileal sodium/bile acid cotransporter (Ileal Na(+)/bile acid 

DE cotransporter) (Na+ dependent ileal bile acid transporter) (Ileal 

DE sodium-dependent bile acid transporter) (ISBT) (Sodium/taurocholate

DE cotransporting polypeptide, ileal).

GN SLC10A2 OR NTCP2.

OS Oryctolagus cuniculus (Rabbit).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Lagomorpha; Leporidae; Oryctolagus. 

OX NCBI_TaxID=9986; 

RN [1] 

RP SEQUENCE FROM N.A. 

RC STRAIN=New Zealand white; TISSUE=Ileum; 

RA Stengelin S., Apel S., Becker W., Maier M., Rosenberger J., Wess G.,

RA Kramer W.;

RL Submitted (OCT-1997) to the EMBL/GenBank/DDBJ databases.

CC -!- FUNCTION: Plays a critical role in the sodium-dependent 

CC reabsorption of bile acids from the lumen of the small intestine. 



CC -!- SUBCELLULAR LOCATION: Integral membrane protein. 

CC -!- SIMILARITY: BELONGS TO THE SODIUM:BILE ACID SYMPORTER FAMILY

CC (SBF).

CC

CC This SWISS-PROT entry is copyright. It is produced through a collaboration

CC between the Swiss Institute of Bioinformatics and the EMBL outstation -

CC the European Bioinformatics Institute. There are no restrictions on its

CC use by non-profit institutions as long as its content is in no way

CC modified and this statement is not removed. Usage by and for commercial

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/

CC or send an email to license@isb-sib.ch).

CC 

DR EMBL; Z54357; CAA91184.1; 

DR EMBL; AJ002005; CAA05135.1; 

DR InterPro; IPR004710; Bil_ac_transpt.

DR InterPro; IPR002657; BilAc/Na_symport.

DR Pfam; PF01758; SBF; 1. 

DR TIGRFAMs; TIGR00841; bass; 1. 

KW Transmembrane; Transport; Symport; Sodium transport; Glycoprotein. 

FT TRANSMEM 30 50 POTENTIAL.
FT TRANSMEM 59 79 POTENTIAL.
FT TRANSMEM 83 103 POTENTIAL.
FT TRANSMEM 128 148 POTENTIAL.
FT TRANSMEM 159 179 POTENTIAL.
FT TRANSMEM 197 217 POTENTIAL.
FT TRANSMEM 226 246 POTENTIAL.
FT TRANSMEM 290 310 POTENTIAL.
FT CARBOHYD 3 3 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 11 11 N-LINKED (GLCNAC...) (POTENTIAL).
SQ SEQUENCE 347 AA; 37729 MW; 1315BB6BADDEE66C CRC64;

Query Match 44.8%; Score 886; DB 1; Length 347;
Best Local Similarity 47.0%; Pred. No. 9.4e-61;
Matches 164; Conservative 73; Mismatches 98; Indels 14; Gaps 5;

Qy 11 CPANSS--EEELPVGLEVHGN--LELVFTVVSTVMMGLLMFSLGCSVEIRKLWSHIRRPW 66

|||:: | | | : I I : I : I I : : : I : I I I : I I : I I I : I I I I I I!

Db 8

Qy 67

|| : | I I I I I : I I I : : I I : : I : I : I I : I I II I I I I I I I III : I I I I I I I I I

Db 68

Qy 127

I I I I I I I : I I I I I I I I : I : I I I : I I I I I I : I I I : I I : I : : I I : : I

Db 128

Qy 187

I : : : I I I I I : I : : I I I : : :: I I I : I : : I : I I I I I : I = MM

Db 188

Qy 247

I I || I I : : M I I I I I : I I : : M II : I I : : M I I : I I

Db 248

Qy 307

III:: : : I I : : : I : : I I

Db 308 IYVAYRK CHGKNDAEFPDI KDTKTEPESSFHQMN--GGFQP 346



RESULT 2 
NTCI_CRIGR 

ID NTCI_CRIGR STANDARD; PRT; 348 AA.

AC Q60414; 

DT 01-NOV-1997 (Rel. 35, Created) 

DT 01-NOV-1997 (Rel. 35, Last sequence update) 

DT 01-NOV-1997 (Rel. 35, Last annotation update) 

DE Ileal sodium/bile acid cotransporter (Ileal Na(+)/bile acid 

DE cotransporter) (Na+ dependent ileal bile acid transporter) (Ileal 

DE sodium-dependent bile acid transporter) (ISBT) (Sodium/taurocholate

DE cotransporting polypeptide, ileal).

GN SLC10A2 OR NTCP2.

OS Cricetulus griseus (Chinese hamster).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Cricetinae;

OC Cricetulus. 

OX NCBI_TaxID=10029; 

RN [1] 

RP SEQUENCE FROM N.A. 

RC TISSUE=Ileum; 

RX MEDLINE=94117449; PubMed=8288599;

RA Wong M.H., Oelkers P.M., Craddock A.L., Dawson P.A.;

RT "Expression cloning and characterization of the hamster ileal sodium-

RT dependent bile acid transporter.";

RL J. Biol. Chem. 269:1340-1347(1994).

CC -!- FUNCTION: Plays a critical role in the sodium-dependent 

CC reabsorption of bile acids from the lumen of the small intestine. 

CC -!- SUBCELLULAR LOCATION: Integral membrane protein. 

CC -!- SIMILARITY: BELONGS TO THE SODIUM:BILE ACID SYMPORTER FAMILY 

CC (SBF).

CC --------------------------------------------------------------------------

CC This SWISS-PROT entry is copyright. It is produced through a collaboration

CC between the Swiss Institute of Bioinformatics and the EMBL outstation -

CC the European Bioinformatics Institute. There are no restrictions on its

CC use by non-profit institutions as long as its content is in no way

CC modified and this statement is not removed. Usage by and for commercial

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/

CC or send an email to license@isb-sib.ch).

CC 

DR EMBL; U02028; AAA18640.1; -. 

DR InterPro; IPR004710; Bil_ac_transpt.

DR InterPro; IPR002657; BilAc/Na_symport.

DR Pfam; PF01758; SBF; 1. 

DR TIGRFAMs; TIGR00841; bass; 1. 

KW Transmembrane; Transport; Symport; Sodium transport; Glycoprotein. 

FT TRANSMEM 29 49 POTENTIAL.

FT TRANSMEM 58 78 POTENTIAL.

FT TRANSMEM 82 102 POTENTIAL.

FT TRANSMEM 127 147 POTENTIAL.

FT TRANSMEM 158 178 POTENTIAL.

FT TRANSMEM 196 216 POTENTIAL.

FT TRANSMEM 225 245 POTENTIAL.

FT TRANSMEM 285 305 POTENTIAL.

FT CARBOHYD 3 3 N-LINKED (GLCNAC...) (POTENTIAL).

FT CARBOHYD 10 10 N-LINKED (GLCNAC...) (POTENTIAL).

FT CARBOHYD 318 318 N-LINKED (GLCNAC...) (POTENTIAL).

SQ SEQUENCE 348 AA; 37919 MW; 1F3A1CFC9C8DFB8C CRC64; 

Query Match 44.7%; Score 884; DB 1; Length 348; 

Best Local Similarity 46.9%; Pred. No. 1.3e-60; 

Matches 164; Conservative 74; Mismatches 102; Indels 10; Gaps 4; 

Qy 7 SSSACPANSS — EEELPVGLEVHGN — LELVFTVVSTVMMGLLMFSLGCSVEIRKLWSHI 62 

: | | | | : : I : : I : I I : I : I I : : : I : I I I : I I : I I : I I : 

Db 3 NSSICNPNATICEGDSCIAPESNFNAILSWMSTVLTILLALVMFSMGCNVELHKFLGHL 62 

Qy 63 RRPWGIAVGLLCQFGLMPFTAYLLAISFSLKPVQAIAVLIMGCCPGGTISNIFTFWVDGD 122 

MMII || I I |||:|| I ::|:::| : MM: III III MM Ml Ml Ml 

Db 63 RRPWGIVVGFLCQFGIMPLTGFVLS VAFGI LPVQAVVVLIQGCCPGGTASNI LAYWVDGD 122 

Q y 123 MDLSISMTTCSTVAALGMMPLCIYLYTWSWSLQQNLTIPYQNIGITLVCLTIPVAFGVYV 182 

| | | | : || I I M I : II I I I M I : : M I I : I I I M I M I I I I M I M I 

Db 123 MDLSVSMTTCSTLLTUjGMMPLCLFIYTKMWVDSGTIVIPYDSIGTSLVALVI PVSIGMYV 182 

Qy 183 NYRWPKQSKIILKIGAWGGVLLLWAVAGWIAKGSWNSDITLLTISFIFPLIGHVTGF 242 

| : : | | : : : | I I I I I I : : I : I : : : M I I M : M : I I MM I • M 
Db 183 NHKWPQKAKI I LKI GS I AGAI LI VLI AWGGI LYQSAWT I EPKLWI I GTI YP I AGYGLGF 242 

Qy 243 LLALFTHQSWQRCRTISLETGAQNIQMCITMLQLSFTAEHLVQMLSFPLAYGLFQLIDGF 302 

|| | | I I I I : : | I I I I I I : I I : M I II : I I : MM I MM 

Db . 243 FLARIAGQPWYRCRTVALETGLQNTQLCSTIVQLSFSPEDLNLVFTFPLIYSIFQIAFAA 302 

Qy 303 LIVAAYQTYKRRLKNKHGKKNSGCTEVCHTRKS — TSSRETNAFLEVNEE 350 

: : : I I M : I I I I : I M MM : M : 

Db 303 I LLGAYVAYKK CHGKNNTELQEKTDNEMEPRSSFQETNKGFQPDEK 34 8 



RESULT 3 
NTCI_RAT 

ID NTCI_RAT STANDARD; PRT; 348 AA. 

AC Q62633; 

DT 01-NOV-1997 (Rel. 35, Created) 

DT 01-NOV-1997 (Rel. 35, Last sequence update) 

DT 01-NOV-1997 (Rel. 35, Last annotation update) 

DE Ileal sodium/bile acid cotransporter (Ileal Na(+)/bile acid 

DE cotransporter) (Na+ dependent ileal bile acid transporter) (Ileal 

DE sodium-dependent bile acid transporter) (ISBT) (Sodium/taurocholate

DE cotransporting polypeptide, ileal).

GN SLC10A2 OR NTCP2.

OS Rattus norvegicus (Rat).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Rattus.

OX NCBI_TaxID=10116; 

RN [1] 

RP SEQUENCE FROM N.A. 

RC STRAIN=Sprague-Dawley; TISSUE=Ileum; 

RX MEDLINE=95164708; PubMed=7860756;

RA Shneider B.L., Dawson P.A., Christie D.M., Hardikar W., Wong M.H.,

RA Suchy F.J.;

RT "Cloning and molecular characterization of the ontogeny of a rat 

RT ileal sodium-dependent bile acid transporter."; 



RL J. Clin. Invest. 95:745-754(1995). 

CC -!- FUNCTION: Plays a critical role in the sodium-dependent 

CC reabsorption of bile acids from the lumen of the small intestine. 

CC -!- SUBCELLULAR LOCATION: Integral membrane protein. 

CC -!- DEVELOPMENTAL STAGE: Transcriptionally regulated increases in mRNA 
CC and protein levels at the time of weaning. 

CC -!- SIMILARITY: BELONGS TO THE SODIUM:BILE ACID SYMPORTER FAMILY
CC (SBF).

CC --------------------------------------------------------------------------

CC This SWISS-PROT entry is copyright. It is produced through a collaboration

CC between the Swiss Institute of Bioinformatics and the EMBL outstation -

CC the European Bioinformatics Institute. There are no restrictions on its

CC use by non-profit institutions as long as its content is in no way

CC modified and this statement is not removed. Usage by and for commercial

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/

CC or send an email to license@isb-sib.ch).



CC
DR EMBL; U07183; AAC53101.1;
DR InterPro; IPR004710; Bil_ac_transpt.
DR InterPro; IPR002657; BilAc/Na_symport.
DR Pfam; PF01758; SBF; 1.
DR TIGRFAMs; TIGR00841; bass; 1.
KW Transmembrane; Transport; Symport; Sodium transport; Glycoprotein.
FT TRANSMEM 29 49 POTENTIAL.
FT TRANSMEM 58 78 POTENTIAL.
FT TRANSMEM 82 102 POTENTIAL.
FT TRANSMEM 127 147 POTENTIAL.
FT TRANSMEM 158 178 POTENTIAL.
FT TRANSMEM 196 216 POTENTIAL.
FT TRANSMEM 225 245 POTENTIAL.
FT TRANSMEM 285 305 POTENTIAL.
FT CARBOHYD 3 3 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 10 10 N-LINKED (GLCNAC...) (POTENTIAL).
SQ SEQUENCE 348 AA; 38024 MW; D4C38CF136D1143B CRC64;


Query Match 44.0%; Score 871; DB 1; Length 348;

Best Local Similarity 47.1%; Pred. No. 1.3e-59;
Matches 165; Conservative 70; Mismatches 105; Indels 10; Gaps 3;

Qy 7 S S SACPANS S EEELPVGLEVHGN LELVFT VVSTVMMGLLMFSLGCSVEI RKLWSHI 62

: | | | | : : | I I I I : I I : : : : : I I I : I I : I I I I I I

Db 3

Qy 63

: | | | | I II llllhll I : : I : : : : MM: I I I I I I IN : I : I I I

Db 63 KRPWGIFVGFLCQFGIMPLTGFILSVASGILPVQAWVLIMGCCPGGTGSNILAYWIDGD

Qy 123 MDLSISMTTCSTVAALGMMPLCIYLYTWSWSLQQNLTIPYQNIGITLVCLTIPVAFGVYV

I I I I : I I I I I I I : I I I I I I I I ::: I I I : I I I : I I I : I I I III: I : = I

Db 123 MDLSVSMTTCSTLLALGMMPLCLFIYTKMWVDSGTIVIPYDSIGISLVALVIPVSIGMFV

Qy 183 N YRW PKQSKIILKI GAWGGVL L L WAVAGWLAKG S WN SDITLLTISFIFPLI GHVT G F

I :: I I : :: I I I I I I I : : I : I : ::: I I I : I : : I • I I I M : I : II

Db 183 NHKWPQKAKIILKIGSIAGAILIVLIAWGGILYQSAWIIEPKLWIIGTIFPIAGYSLGF

Qy 243 LLALFTHQSWQRCRTISLETGAQNIQMCITMLQLSFTAEHLVQMLSFPLAYGLFQLIDGF

II I I I I I I :: I I I I I I I : I I :: I I I I : I I : : I I I I : I I I s

Db 243 FLARLAGQPWYRCRTVALETGMQNTQLCSTIVQLSFSPEDLNLVFTFPLIYTVFQLVFAA 302

Qy 303 LIVAAYQTYKRRLKNKHGKKNSGCTEVCHTRKS-- TSSRETNAFLEVNEE 350

: I : I I I I : III:: I 1=111 : : I :

Db 303 IILGMYVTYKK CHGKNDAEFLEKTDNDMDPMPSFQETNKGFQPDEK 348



RESULT 4 
NTCI_HUMAN 

ID NTCI_HUMAN STANDARD; PRT; 348 AA. 

AC Q12908; Q13839; 

DT 01-NOV-1997 (Rel. 35, Created) 

DT 01-NOV-1997 (Rel. 35, Last sequence update) 

DT 28-FEB-2003 (Rel. 41, Last annotation update) 

DE Ileal sodium/bile acid cotransporter (Ileal Na(+)/bile acid 

DE cotransporter) (Na+ dependent ileal bile acid transporter) (Ileal 

DE sodium-dependent bile acid transporter) (ISBT) (Sodium/taurocholate

DE cotransporting polypeptide, ileal).

GN SLC10A2 OR NTCP2.

OS Homo sapiens (Human).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo. 

OX NCBI_TaxID=9606; 

RN [1] 

RP SEQUENCE FROM N.A., AND VARIANT CD SER-290.

RC TISSUE=Ileum;

RX MEDLINE=96070831; PubMed=7592981;

RA Wong M.H., Oelkers P., Dawson P.A.;

RT "Identification of a mutation in the ileal sodium-dependent bile acid 

RT transporter gene that abolishes transport activity."; 

RL J. Biol. Chem. 270:27228-27234(1995). 

RN [2] 

RP SEQUENCE FROM N.A., VARIANTS PBAM PRO-243 AND MET-262, AND VARIANT 

RP SER-171. 

RX MEDLINE=97263517; PubMed=9109432;

RA Oelkers P., Kirby L.C., Heubi J.E., Dawson P.A.;

RT "Primary bile acid malabsorption caused by mutations in the ileal

RT sodium-dependent bile acid transporter gene (SLC10A2).";

RL J. Clin. Invest. 99:1880-1887(1997). 

RN [3] 

RP SEQUENCE FROM N.A. 

RX MEDLINE=22269935; PubMed=12364586; 

RA Chumakov I., Blumenfeld M., Guerassimenko O., Cavarec L., Palicio M.,

RA Abderrahim H., Bougueleret L., Barry C., Tanaka H., La Rosa P.,

RA Puech A., Tahri N., Cohen-Akenine A., Delabrosse S., Lissarrague S.,

RA Picard F.-P., Maurice K., Essioux L., Millasseau P., Grel P.,

RA Debailleul V., Simon A.-M., Caterina D., Dufaure I., Malekzadeh K.,

RA Belova M., Luan J.-J., Bouillot M., Sambucy J.-L., Primas G.,

RA Saumier M., Boubkiri N., Martin-Saumier S., Nasroune M., Peixoto H.,

RA Delaye A., Pinchot V., Bastucci M., Guillou S., Chevillon M.,

RA Sainz-Fuertes R., Meguenni S., Aurich-Costa J., Cherif D., Gimalac A.,

RA Van Duijn C., Gauvreau D., Ouelette G., Fortier I., Realson J.,

RA Sherbatich T., Riazanskay N., Rogaev E., Raeymaekers P., Aerssens J.,

RA Konings F., Luyten W., Macciardi F., Sham P.C., Straub R.E.,

RA Weinberger D.R., Cohen N., Cohen D.;

RT "Genetic and physiological data implicating the new human gene G72 and 

RT the gene for D-amino acid oxidase in schizophrenia."; 



RL Proc. Natl. Acad. Sci. U.S.A. 99:13675-13680(2002). 

RN [4] 

RP SEQUENCE FROM N.A. 

RA Smith M.;

RL Submitted (MAY-2001) to the EMBL/GenBank/DDBJ databases.

RN [5]

RP SEQUENCE OF 1-22 FROM N.A.

RC TISSUE=Blood;

RA Stengelin S., Apel S., Becker W., Maier M., Rosenberger J.,

RA Kaufmann C., Wess G., Kramer W.;

RL Submitted (OCT-1995) to the EMBL/GenBank/DDBJ databases.

CC -!- FUNCTION: Plays a critical role in the sodium-dependent 

CC reabsorption of bile acids from the lumen of the small intestine. 

CC -!- SUBCELLULAR LOCATION: Integral membrane protein. 

CC -!- DISEASE: Defects in SLC10A2 are a cause of primary bile acid 

CC malabsorption (PBAM), an idiopathic intestinal disorder associated

CC with congenital diarrhea, steatorrhea, interruption of the 

CC enterohepatic circulation of bile acids, and reduced plasma 

CC cholesterol levels. 

CC -!- DISEASE: Defects in SLC10A2 are a cause of Crohn's disease (CD). 

CC -!- SIMILARITY: BELONGS TO THE SODIUM:BILE ACID SYMPORTER FAMILY
CC (SBF).

CC --------------------------------------------------------------------------

CC This SWISS-PROT entry is copyright. It is produced through a collaboration

CC between the Swiss Institute of Bioinformatics and the EMBL outstation -

CC the European Bioinformatics Institute. There are no restrictions on its

CC use by non-profit institutions as long as its content is in no way

CC modified and this statement is not removed. Usage by and for commercial

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/

CC or send an email to license@isb-sib.ch).



CC
DR EMBL; U10417; AAC51870.1; -.
DR EMBL; U67674; AAC95398.1; -.
DR EMBL; U67669; AAC95398.1; JOINED.
DR EMBL; U67670; AAC95398.1; JOINED.
DR EMBL; U67671; AAC95398.1; JOINED.
DR EMBL; U67672; AAC95398.1; JOINED.
DR EMBL; U67673; AAC95398.1; JOINED.
DR EMBL; AE014304; AAN16026.1; -.
DR EMBL; AL161771; CAC39447.1; -.
DR EMBL; Z54350; CAA91161.1;
DR PIR; I38655; I38655.
DR Genew; HGNC:10906; SLC10A2.
DR MIM; 601295; -.



DR GO; GO:0005887; C:integral to plasma membrane; TAS.

DR GO; GO:0008508; F:bile acid:sodium symporter activity; TAS.

DR GO; GO:0006810; P:transport; TAS.

DR InterPro; IPR004710; Bil_ac_transpt.

DR InterPro; IPR002657; BilAc/Na_symport.

DR Pfam; PF01758; SBF; 1. 

DR TIGRFAMs; TIGR00841; bass; 1. 

KW Transmembrane; Transport; Symport; Sodium transport; Glycoprotein; 

KW Disease mutation; Polymorphism. 

FT TRANSMEM 29 49 POTENTIAL.
FT TRANSMEM 58 78 POTENTIAL.
FT TRANSMEM 82 102 POTENTIAL.
FT TRANSMEM 127 147 POTENTIAL.
FT TRANSMEM 158 178 POTENTIAL.
FT TRANSMEM POTENTIAL.
FT TRANSMEM POTENTIAL.
FT TRANSMEM POTENTIAL.
FT CARBOHYD N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD N-LINKED (GLCNAC...) (POTENTIAL).
FT VARIANT 171 171 A -> S.
FT /FTId=VAR_004613.
FT VARIANT 243 243 L -> P (IN PBAM; ABOLISHES
FT TAUROCHOLATE TRANSPORT).
FT /FTId=VAR_004614.
FT VARIANT 262 262 T -> M (IN PBAM; ABOLISHES
FT TAUROCHOLATE TRANSPORT).
FT /FTId=VAR_004615.
FT VARIANT 290 290 P -> S (IN CD; ABOLISHES TAUROCHOLATE
FT TRANSPORT).
FT /FTId=VAR_004616.
SQ SEQUENCE 348 AA; 37697 MW; 15990AAA91CCDB06 CRC64;



Query Match 43.5%; Score 860.5; DB 1; Length 348; 

Best Local Similarity 45.6%; Pred. No. 8.4e-59; 

Matches 160; Conservative 68; Mismatches 104; Indels 19; Gaps 4; 

Qy 5 C S S S SACPANS S EEELPVGLEVHGNLELVFTVVSTVMMGLLMFS LGCSVEI RKLWSHI RR 64 

|| : I I : : I : I : I I : : : I : M I : I I : I I I = I I I : I 

Db 14 CSGASCWPESNFNNI LS WLSTVLT I LLALVMFSMGCNVEI KKFLGHI KR 64 

Qy 65 PWGIAVGLLCQFGLMPFTAYLLAISFSLKPVQAIAVLIMGCCPGGTISNIFTFWVDGDMD 124 

I I I I I I I I I I I : I I I :: I : :: I : I : I I : I II : I I I I I I I I I I : I I I I I I I 

Db 65 PWGICVGFLCQFGIMPLTGFILSVAFDILPLQAVWLIIGCCPGGTASNILAYWVDGDMD 124 

Qy 125 LSISMTTCSTVAALGMMPLCIYLYTWSWSLQQNLTIPYQNIGITLVCLTIPVAFGVYVNY 184 

I I : I I I I I I I : I I I I I I I I :: I I I : : I I I I I I : I I I : I I : I : : I I : 

Db 125 LSVSMTTCSTLLALGMMPLCLLIYTKMWVDSGSIVIPYDNIGTSLVALWPVSIGMFVNH 184 

Qy 185 RWPKQSKIILKIGAWGGVLLLWAVAGWIAKGSWNSDITLLTISFIFPLIGHVTGFLL 244 

: I I : :: I I I I I I I : : I : I : ::: I I I : I : : I I I I I I : I 2 I I I I 

Db 185 KWPQK7VKIILKIGSIAGAILIVLIAWGGILYQSAWIIAPKLWIIGTIFPVAGYSLGFLL 244 

Qy 245 ALFTHQSWQRCRTISLETGAQNIQMCITMLQLSFTAEHLVQMLSFPLAYGLFQLIDGFLI 304 

I I I I I I :: I I I I I I : I I I I I I I I I : : I I I I : I I I : 

Db 245 ARIAGLPWYRCRTVAFETGMQNTQLCSTIVQLSFTPEELNWFTFPLIYSIFQLAFAAIF 304 

Qy 305 VAAYQT YKRRLKNKHGKKNS GCTEVCHTRKST S S RETNAFLEVNEEGAIT P 355 

: I I I : I I I : I | : : : I : I I I 

Db 305 LGFYVAYKK CHGKNKAEIPE SKENGTEPESSFYKAN— GGFQP 345 



RESULT 5 
NTCP_RAT 

ID NTCP_RAT STANDARD; PRT; 362 AA. 

AC P26435; 

DT 01-AUG-1992 (Rel. 23, Created) 

DT 01-AUG-1992 (Rel. 23, Last sequence update) 

DT 15-MAR-2004 (Rel. 43, Last annotation update) 

DE Sodium/bile acid cotransporter (Na(+)/bile acid cotransporter)
DE (Na(+)/taurocholate transport protein) (Sodium/taurocholate
DE cotransporting polypeptide).
GN SLC10A1 OR NTCP.
OS Rattus norvegicus (Rat).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Rattus.
OX NCBI_TaxID=10116;
RN [1]
RP SEQUENCE FROM N.A.
RC TISSUE=Liver;
RX MEDLINE=92073340; PubMed=1961729;
RA Hagenbuch B., Stieger B., Foguet M., Luebbert H., Meier P.J.;
RT "Functional expression cloning and characterization of the hepatocyte
RT Na+/bile acid cotransport system.";
RL Proc. Natl. Acad. Sci. U.S.A. 88:10629-10633(1991).
CC -!- FUNCTION: The hepatic sodium/bile acid uptake system exhibits
CC broad substrate specificity and transports various nonbile acid
CC organic compounds as well. It is strictly dependent on the
CC extracellular presence of sodium.
CC -!- SUBCELLULAR LOCATION: Integral membrane protein.
CC -!- TISSUE SPECIFICITY: Liver and kidney.
CC -!- SIMILARITY: BELONGS TO THE SODIUM:BILE ACID SYMPORTER FAMILY
CC (SBF).
CC
CC This SWISS-PROT entry is copyright. It is produced through a collaboration
CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC the European Bioinformatics Institute. There are no restrictions on its
CC use by non-profit institutions as long as its content is in no way
CC modified and this statement is not removed. Usage by and for commercial
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC or send an email to license@isb-sib.ch).
CC
DR EMBL; M77479; AAA42112.1; -.
DR PIR; A41601; A41601.
DR InterPro; IPR004710; Bil_ac_transpt.
DR InterPro; IPR002657; BilAc/Na_symport.
DR Pfam; PF01758; SBF; 1.
DR TIGRFAMs; TIGR00841; bass; 1.
KW Transmembrane; Transport; Symport; Sodium transport; Glycoprotein.



FT TRANSMEM 24 45 POTENTIAL.
FT TRANSMEM 60 80 POTENTIAL.
FT TRANSMEM 82 98 POTENTIAL.
FT TRANSMEM 158 178 POTENTIAL.
FT TRANSMEM 190 211 POTENTIAL.
FT TRANSMEM 228 244 POTENTIAL.
FT TRANSMEM 285 306 POTENTIAL.
FT CARBOHYD 5 5 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 11 11 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 103 103 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 117 117 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 271 271 N-LINKED (GLCNAC...) (POTENTIAL).
SQ SEQUENCE 362 AA; 39295 MW; F0ABB76076A57550 CRC64;

Query Match 28.3%; Score 559.5; DB 1; Length 362;


Best Local Similarity 37.2%; Pred. No. 9.3e-36; 
Matches 133; Conservative 69; Mismatches 135; Indels 21; Gaps 9; 



Qy 10 ACPANSSEEELPVGLEVHGNLELVFTVVSTVMMGLLMFSLGCSVEIRKLWSHIRRPWGIA 69

: | I I II I I : • • = : | : | : | | | | | : : I | : : I : : I I :

Db 7 SAPFNFS LPPGFG-HRATDKALSIILVLMLLLIMLSLGCTMEFSKIKAHLWKPKGVI 62



Qy 70 VGLLCQFGLMPFTAYLLAISFSLKPVQAIAVLIMGCCPGGTISNIFTFWVDGDMDLSISM 129 

I I : I I I : M | : I I II : : I : I : I I I I I I I : I I : I I : I I I : I I I I 

D b 63 VALVAQFGIMPLAAFLLGKIFHLSNIEALAILICGCSPGGNLSNLFTIAMKGDMNLSIVM 122 

Qy 130 TTCSTVAALGMMPLCIYLYT W S W S LQQN LT I P YQN I G I T LVC LT I P VAFGVYVN YRW 186 

I I I I : : I I I I I I I : I : I : = I : : I I : I I : I I : I I I : : : 

D b 123 TTCSSFSALGMMPLLLYVYSKGIYDGDLKDK — VPYKGIMISLVIVLIPCTIGIVLKSKR 180 

Qy 187 PKQSKIILKIGAWGGVLLLWAVAGVVLAKGSWNSDIT--LLTISFIFPLIGHVTGFLL 244 

| MM:: : I : I I : I : I II I : I I : I = s I 

D b 181 PHYVPYILKGGMIITFLLSVAVTALSVINVGNSIMFVMTPHLLATSSLMPFSGFLMGYIL 240 

Q y 245 - AL FT HQ S WQ RC - RT I S L ET G AQN I QMC I TMLQ L S FT AEH L VQML S F P LAYGL FQ L I D G F 302 

Ml I | I I M : I I I I I I I : I I : I :: I I : : I I I I : I I I = I 

Db 241 SALF— QLNPSCRRTISMETGFQNIQLCSTILNVTFPPEVIGPLFFFPLLYMIFQLAEGL 298 

Qy 303 LIVAAYQTYKRRLKNKHGKKNSGCTEVCHTRKSTSSRETNAFLEVNEEGAITPGPPGP 360 

| | : : : | : : I I : : : : I I : I I I II I 

Db 299 LIIIIFRCYEKI KPPKDQTKITYKAAATEDATPAALEKGTHNGNIPPLQPGP 350 



RESULT 6 
NTCP_HUMAN 

ID NTCP_HUMAN STANDARD; PRT; 349 AA.

AC Q14973; 

DT 01-NOV-1997 (Rel. 35, Created) 

DT 01-NOV-1997 (Rel. 35, Last sequence update) 

DT 15-MAR-2004 (Rel. 43, Last annotation update) 

DE Sodium/bile acid cotransporter (Na(+)/bile acid cotransporter) 

DE (Na(+)/taurocholate transport protein) (Sodium/taurocholate

DE cotransporting polypeptide).

GN SLC10A1 OR NTCP.

OS Homo sapiens (Human).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo. 

OX NCBI_TaxID=9606; 

RN [1] 

RP SEQUENCE FROM N.A. 

RC TISSUE=Liver; 

RX MEDLINE=94179485; PubMed=8132774;

RA Hagenbuch B., Meier P.J.;

RT "Molecular cloning, chromosomal localization, and functional 

RT characterization of a human liver Na+/bile acid cotransporter."; 

RL J. Clin. Invest. 93:1326-1331(1994). 

CC -!- FUNCTION: The hepatic sodium/bile acid uptake system exhibits 
CC broad substrate specificity and transports various nonbile acid 

CC organic compounds as well. It is strictly dependent on the 

CC extracellular presence of sodium. 

CC -!- SUBCELLULAR LOCATION: Integral membrane protein. 

CC -!- SIMILARITY: BELONGS TO THE SODIUM:BILE ACID SYMPORTER FAMILY 

CC (SBF).

CC --------------------------------------------------------------------------

CC This SWISS-PROT entry is copyright. It is produced through a collaboration

CC between the Swiss Institute of Bioinformatics and the EMBL outstation -

CC the European Bioinformatics Institute. There are no restrictions on its

CC use by non-profit institutions as long as its content is in no way

CC modified and this statement is not removed. Usage by and for commercial

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/

CC or send an email to license@isb-sib.ch).

CC 

DR EMBL; L21893; AAA36381.1; 

DR PIR; I55601; I55601.

DR Genew; HGNC:10905; SLC10A1.

DR MIM; 182396; -.

DR GO; GO:0005887; C:integral to plasma membrane; TAS.

DR GO; GO:0008508; F:bile acid:sodium symporter activity; TAS.

DR GO; GO:0006810; P:transport; TAS.

DR InterPro; IPR004710; Bil_ac_transpt.

DR InterPro; IPR002657; BilAc/Na_symport.

DR Pfam; PF01758; SBF; 1. 



DR TIGRFAMs; TIGR00841; bass; 1.
KW Transmembrane; Transport; Symport; Sodium transport; Glycoprotein.
FT TRANSMEM 25 45 POTENTIAL.
FT TRANSMEM 60 80 POTENTIAL.
FT TRANSMEM 91 111 POTENTIAL.
FT TRANSMEM 120 140 POTENTIAL.
FT TRANSMEM 156 176 POTENTIAL.
FT TRANSMEM 191 211 POTENTIAL.
FT TRANSMEM 220 240 POTENTIAL.
FT TRANSMEM 283 303 POTENTIAL.
FT CARBOHYD 5 5 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 11 11 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 117 117 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 336 336 N-LINKED (GLCNAC...) (POTENTIAL).
SQ SEQUENCE 349 AA; 38119 MW; F3AB2CC2FBD925E3 CRC64;

Query Match 27.9%; Score 553; DB 1; Length 349;



Best Local Similarity 36.0%; Pred. No. 2.8e-35; 
Matches 124; Conservative 77; Mismatches 109; Indels 34; Gaps 10;

Qy 31 ELVFTWSTVMMGLLMFSLGCSVEIRKLWSHIRRPWGIAVGLLCQFGLMPFTAYLLAISF 90 

: | : | : | : : I I I I I : : I | : : I : : I I : I : I : I : I : I I I I : : I I 

D b 24 DI^SVILVFMLFFIMLSLGCTMEFSKIKAHLWKPKGLAIALVAQYGIMPLTAFVLGKVF 83 

Qy 91 SLKPVQAIAVLIMGCCPGGTISNIFT FWVDGDMDLSI SMTTCSTVAALGMMPLCIYLYT- 149 

M : : | : | : I : II III : I I : I : : 111 = 111 I I I I I I I I I I I I I : I : I : 

Db 84 RLKNIEAIJVILVCGCSPGGNLSNVFSLAMKGDMNLSIVMTTCSTFCALGMMPLLLYIYSR 143 

Qy 150 — WSWSLQQNLTIPYQNIGITLVCLTIPVAFGVYVNYRWPKQSKIILKIGAWGGVLLLV 207 

: | : : I I : I I : I I : I I I : : : I : : : : I I : : : I : 

Db 144 GIYDGDLKDK— VPYKGIVISLVLVLIPCTIGIVLKSKRPQYMRYVIKGGMII ILL 197 

Qy 208 VAVAG WLAKG S WN S D I TLLTI S FI FPLIGHVTGFLL-ALFTHQSWQRC-RTI S 259 

: | | ||: : | . I : I : I I I : I : : I I I I I I I I : I 

Db 198 CSVAVTVLSAINVGKSIMFAMTPLLIATSSLMPFIGFLLGYVLSALFCLNG— RCRRTVS 255 

Qy 260 LETGAQNIQMCITMLQLSFTAEHLVQMLSFPLAYGLFQLIDGFLIVAAYQTYKRRLKNKH 319 

:||| M:|:| 1:1 ::| l' : : I I I I : I I I : I I = = I : I : : I 

Db 256 METGCQNVQLCSTILNVAFPPEVIGPLFFFPLLYMI FQLGEGLLLIAIFWCYE-KFKTPK 314 



Qy 320 GKKNSGCTEVCHTRKSTSSRETNAFLEVNEEGAITPGPPGPMDC 363



I I ... I . I I 11*1 II 

D b 315 DK TKMIYTAATT EETI PGALGNGTYKGEDC 344 

RESULT 7 
NTCP_MOUSE 

ID NTCP_MOUSE STANDARD; PRT; 362 AA. 

AC O08705;

DT 01-NOV-1997 (Rel. 35, Created) 

DT 01-NOV-1997 (Rel. 35, Last sequence update) 

DT 15-MAR-2004 (Rel. 43, Last annotation update) 

DE Sodium/bile acid cotransporter (Na(+)/bile acid cotransporter)

DE (Na(+)/taurocholate transport protein) (Sodium/taurocholate

DE cotransporting polypeptide).

GN SLC10A1 OR NTCP.

OS Mus musculus (Mouse).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.

OX NCBI_TaxID=10090; 

RN [1] 

RP SEQUENCE FROM N.A. 

RC STRAIN=BALB/c; TISSUE=Liver;

RA Saeki T.;

RL Submitted (MAY-1997) to the EMBL/GenBank/DDBJ databases.

RN [2] 

RP SEQUENCE FROM N.A. 

RC STRAIN=BALB/c; 

RA Hagenbuch B.; 

RL Submitted (OCT-1997) to the EMBL/GenBank/DDBJ databases.

CC -!- FUNCTION: The hepatic sodium/bile acid uptake system exhibits 

CC broad substrate specificity and transports various nonbile acid 

CC organic compounds as well. It is strictly dependent on the 

CC extracellular presence of sodium. 

CC -!- SUBCELLULAR LOCATION: Integral membrane protein. 

CC -!- SIMILARITY: BELONGS TO THE SODIUM:BILE ACID SYMPORTER FAMILY

CC (SBF). 

CC 

CC This SWISS-PROT entry is copyright. It is produced through a collaboration 

CC between the Swiss Institute of Bioinformatics and the EMBL outstation -

CC the European Bioinformatics Institute. There are no restrictions on its

CC use by non-profit institutions as long as its content is in no way 

CC modified and this statement is not removed. Usage by and for commercial 

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/ 

CC or send an email to license@isb-sib.ch).

CC 

DR EMBL; AB003303; BAA19846.1; -. 

DR EMBL; U95131; AAB81023.1; -.

DR MGD; MGI:97379; Slc10a1.

DR InterPro; IPR004710; Bil_ac_transpt.

DR InterPro; IPR002657; BilAc/Na_symport.

DR Pfam; PF01758; SBF; 1. 

DR TIGRFAMs; TIGR00841; bass; 1. 

KW Transmembrane; Transport; Symport; Sodium transport; Glycoprotein. 

FT TRANSMEM 24 45 POTENTIAL. 

FT TRANSMEM 60 80 POTENTIAL.

FT TRANSMEM 82 98 POTENTIAL. 

FT TRANSMEM 158 178 POTENTIAL. 



FT TRANSMEM [illegible] [illegible] POTENTIAL.

FT TRANSMEM [illegible] [illegible] POTENTIAL.

FT TRANSMEM [illegible] [illegible] POTENTIAL.




FT CARBOHYD 5 5 N-LINKED (GLCNAC...) (POTENTIAL).

FT CARBOHYD 11 11 N-LINKED (GLCNAC...) (POTENTIAL).

FT CARBOHYD 103 103 N-LINKED (GLCNAC...) (POTENTIAL).

FT CARBOHYD 117 117 N-LINKED (GLCNAC...) (POTENTIAL).

FT CARBOHYD 271 271 N-LINKED (GLCNAC...) (POTENTIAL).


SQ SEQUENCE 362 AA; 39413 MW; 7A70493E1804280F CRC64;

Query Match 27.9%; Score 553; DB 1; Length 362;



Best Local Similarity 35.1%; Pred. No. 2.9e-35; 

Matches 129; Conservative 72; Mismatches 142; Indels 24; Gaps 8; 

Qy 10 ACPANSSEEELPVGLEVHGNLELVFTVVSTVMMGLLMFSLGCSVEIRKLWSHIRRPWGIA 69

: I I I III I : : I : I I : I : I I I I I : : I I : : I : I I : 

Db 7 SAP FNFS LPPGFG-HRATDTALSVILVVMLLLIMLSLGCTMEFSKIKMFWKPKGVI 62 

Qy 70 VGLLCQFGLMP FTAYLLAI S FSLKPVQAI AVLIMGCCPGGT I SN I FT FWVDGDMDLS I SM 129 

: : : I : I : I I : I : I I II : : I : I : I I I I I I I : I I : I I : I I I : I I I I 
Db 63 IAIVAQYGIMPLSAFLLGKVFHLTSII^LAILICGCSPGGNLSNLFTLAMKGDMNLSIVM 122 

Qy 130 TTCSTVAALGMMPLCIYLYT WSWS LQQNLTI P YQNI GITLVCLT I PVAFGVYVNYRW 186 

MM: I I I I I I I : I : I : : I : : | | : | : : I I : I I I I : : : : 
Db 123 TTCSSFTALGMMPLLLYIYSKGIYDGDLKDK— VPYKGIMLSLVMVLIPCAIGIFLKSKR 180 

Qy 187 P KQ S K 1 1 L K I GAWGGVL L L VVAVAGWLAKG S WN S D I T — LLTISFIFPLIGHVTGFLL 244 

I : I I I : : Mil I : I : I II I : I I : I : : I 

Db 181 PHWPYVT.KAGMIITFSLSVAWVLSVINVGNSIMFVMTPHLLATSSLMPFTGFLMGYIL 240 

Qy 245 ALFTHQSWQRCRTISLETGAQNIQMCITMLQLSFTAEHLVQMLSFPLAYGLFQLIDGFLI 304 

: : I I I I : I I I I I : I : I I : I : : I I : : I ! I I : I I I : I I 

Db 241 SALFRLNPSCRRTISMETGFQNVQLCSTILNVTFPPEVIGPLFFFPLLYMIFQLAEGLLF 300 

Qy 305 VAAYQTYKRRLKNKHGKKNSGCTEVCHTRKSTSSRETNAFLEVNEEGAITPGPPGPMDCH 364 

: : : I I I I I I : : : : I I : I I I I 
Db 301 IIIFRCY LKIKPQKDQ TKITYKAAATEDATPAALEKGTHNGNNPPTQPG 349 

Qy 365 RALEPVG 371 

III 

Db 350 — LSPNG 354 



RESULT 8 
P3_HUMAN 

ID P3_HUMAN STANDARD; PRT; 477 AA. 

AC P09131; 

DT 01-MAR-1989 (Rel. 10, Created) 

DT 01-MAR-1989 (Rel. 10, Last sequence update) 

DT 10-OCT-2003 (Rel. 42, Last annotation update) 

DE P3 protein. 

GN SLC10A3 OR P3. 

OS Homo sapiens (Human).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo. 

OX NCBI_TaxID=9606; 

RN [1] 



RP SEQUENCE FROM N.A.

RX MEDLINE=89041548; PubMed=3186440;

RA Alcalay M., Toniolo D.;

RT "CpG islands of the X chromosome are gene associated."; 

RL Nucleic Acids Res. 16:9527-9556(1988). 

RN [2] 

RP SEQUENCE FROM N.A. 

RX MEDLINE=96311563; PubMed=8733135;

RA Chen E.Y., Zollo M., Mazzarella R.A., Ciccodicola A., Chen C.-N.,

RA Zuo L., Heiner C., Burough F.W., Ripetto M., Schlessinger D.,

RA D'Urso M.;

RT "Long-range sequence analysis in Xq28: thirteen known and six

RT candidate genes in 219.4 kb of high GC DNA between the RCP/GCP and

RT G6PD loci.";

RL Hum. Mol. Genet. 5:659-668(1996). 

CC -!- FUNCTION: The ubiquitous expression and the conservation of the 
CC sequence in distant animal species suggest that the gene codes for 

CC a protein with housekeeping functions. 

CC -!- SUBCELLULAR LOCATION: Integral membrane protein (Probable). 

CC -!- SIMILARITY: TO P3 PROTEIN OF ANIMALS AND YEASTS. 

CC -!- SIMILARITY: BELONGS TO THE SODIUM:BILE ACID SYMPORTER FAMILY
CC (SBF).

CC 

CC This SWISS-PROT entry is copyright. It is produced through a collaboration 

CC between the Swiss Institute of Bioinformatics and the EMBL outstation -

CC the European Bioinformatics Institute. There are no restrictions on its

CC use by non-profit institutions as long as its content is in no way 

CC modified and this statement is not removed. Usage by and for commercial 

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/ 

CC or send an email to license@isb-sib.ch).

CC 

DR EMBL; X12458; CAA30998.1; -. 

DR EMBL; L44140; AAA92651.1; -. 

DR PIR; S01696; S01696. 

DR Genew; HGNC:22979; SLC10A3.

DR MIM; 312090; -.

DR GO; GO:0016021; C:integral to membrane; NAS.

DR GO; GO:0008508; F:bile acid:sodium symporter activity; NAS.

DR GO; GO:0006814; P:sodium ion transport; NAS.

DR InterPro; IPR004710; Bil_ac_transpt.

DR InterPro; IPR002657; BilAc/Na_symport.

DR Pfam; PF01758; SBF; 1. 

DR TIGRFAMs; TIGR00841; bass; 1. 

KW Transmembrane; Transport; Symport. 

SQ SEQUENCE 477 AA; 50332 MW; 49CB363EB3B66A1D CRC64;

Query Match 16.9%; Score 333.5; DB 1; Length 477; 

Best Local Similarity 31.5%; Pred. No. 2.4e-18; 

Matches 87; Conservative 53; Mismatches 115; Indels 21; Gaps 3; 

Qy 12 PANSSEEELPVGLEVHGNLELVFTVVSTVMMGLLMFSLGCSVEIRKLWSHIRRPWGIAVG 71

||: I I ::::::: I I I I I : I : : I : : I 

Db 172 PAEDTPATLSADLAHFSENPILYLLLPLIFVNKCSF — GCKVELEVLKGLMQSPQPMLLG 229

Qy 72 LLCQFGLMPFTAYLLAISFSLKPVQAIAVLIMGCCPGGTISNIFTFWVDGDMDLSISMTT 131 

I I I I : I I I : I : I II I : : : I III I : I : : I I : I : I I I I 

Db 230 LLGQFLWPLYAFLMAKVFMLPKALALGLIITCSSPGGGGSYLFSLLLGGDVTLAISMTF 289 



Qy 132 CSTVAALGMMPLCIYLYTWSWSLQQNLTIPYQNIGITLVCLTIPVAFGVYVNYRWPKQSK 191 

I I I I I I : I I : | : I : : I : I I I I : : I I : I I I : : I I I : 
Db 290 LSTVAATGFLPLSSAIYSRLLSIHETLHVPISKILGTLLFIAIPIAVGVLIKSKLPKFSQ 349 

Qy 192 IILKIGAWGGVLLL WAVAG WLAK G S WN SDITLLTISFIFPLI GH VT G 241 

::|:: III I I :||: I :: : MM : I 

Db 350 LLLQWKP FS FVLLLGGLFLAYRMGVFI LAGI RL PIVLVGITVPLVGLLVG 400 

Qy 242 FLLALFTHQSWQRCRTISLETGAQNIQMCITMLQLS 277

: | | : I I : I : I I I I : : I I I I I 

Db 401 YCLATCLKLPVAQRRTVSIEVGVQNSLLALAMLQLS 436



RESULT 9 
P3_MOUSE

ID P3_MOUSE STANDARD; PRT; 182 AA. 

AC P21129; 

DT 01-FEB-1991 (Rel. 17, Created) 

DT 01-FEB-1991 (Rel. 17, Last sequence update)

DT 10-OCT-2003 (Rel. 42, Last annotation update) 

DE P3 protein (Fragment).

GN SLC10A3 OR P3.

OS Mus musculus (Mouse).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.

OX NCBI_TaxID=10090; 

RN [1] 

RP SEQUENCE FROM N.A. 

RC STRAIN=BALB/c; TISSUE=Liver;

RX MEDLINE=90307023; PubMed=1973144;

RA Filippi M., Tribioli C., Toniolo D.;

RT "Linkage and sequence conservation of the X-linked genes DXS253E (P3) 

RT and DXS254E (GdX) in mouse and man."; 

RL Genomics 7:453-457(1990). 

CC -!- FUNCTION: The ubiquitous expression and the conservation of the 
CC sequence in distant animal species suggest that the gene codes for 

CC a protein with housekeeping functions. 

CC -!- SUBCELLULAR LOCATION: Integral membrane protein (Probable). 

CC -!- SIMILARITY: BELONGS TO THE SODIUM:BILE ACID SYMPORTER FAMILY 
CC (SBF).

CC 

CC This SWISS-PROT entry is copyright. It is produced through a collaboration 

CC between the Swiss Institute of Bioinformatics and the EMBL outstation -

CC the European Bioinformatics Institute. There are no restrictions on its

CC use by non-profit institutions as long as its content is in no way 

CC modified and this statement is not removed. Usage by and for commercial 

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/ 

CC or send an email to license@isb-sib.ch).

CC 

DR EMBL; J04761; AAA40519.1; -.

DR PIR; 154222; 154222.

DR InterPro; IPR002657; BilAc/Na_symport.

DR Pfam; PF01758; SBF; 1. 

KW Transmembrane; Transport; Symport. 

FT NON_TER 1 1

SQ SEQUENCE 182 AA; 19629 MW; 472D732820CDD620 CRC64;



Query Match 9.1%; Score 180.5; DB 1; Length 182;

Best Local Similarity 30.2%; Pred. No. 4.6e-07;

Matches 45; Conservative 31; Mismatches 54; Indels 19; Gaps 2;



Qy 139 GMMPLCIYLYTWSWSLQQNLTIPYQNIGITLVCLTIPVAFGVYVNYRWPKQSKIILKIGA 198 

| : I I : | : : I : : I : I I I I : : I I : I I I : : I I I : : : I : : 
Db 2 GFLPLSSAIYSYLLSIHETLHVPISKILGTLLFIAIPIAAGWIKSKLPKFSELLLQVIK 61 

Qy 199 WGGVLLL WAVAGWLAKGSWNSDITLLTI SFIFPLIGHVTGFLLALFT 248 

: I I I I : I I I : : : I I I : I : I : I I : 

Db 62 PFSFILLLGGLFLAYHMGVFILVGVRLPIVLVGFTVPLVGLLVGYSLAICL 112

Qy 249 HQSWQRCRTISLETGAQNIQMCITMLQLS 277

: I I : I : I I I I : : I I I I I 

Db 113 KLPVAQRRTVSIEVGVQNSLLALAMLQLS 141 



RESULT 10 
YCXA_BACSU 

ID YCXA_BACSU STANDARD; PRT; 409 AA.

AC Q08791; 

DT 01-FEB-1995 (Rel. 31, Created) 

DT 01-FEB-1995 (Rel. 31, Last sequence update) 

DT 10-OCT-2003 (Rel. 42, Last annotation update) 

DE Hypothetical protein ycxA (ORF5).

GN YCXA OR BSU03530. 

OS Bacillus subtilis. 

OC Bacteria; Firmicutes; Bacillales; Bacillaceae; Bacillus. 

OX NCBI_TaxID=1423; 

RN [1] 

RP SEQUENCE FROM N.A. 

RC STRAIN=168 / JH642; 

RX MEDLINE=93360813; PubMed=8355609;

RA Cosmina P., Rodriguez F., de Ferra F., Grandi G., Perego M.,

RA Venema G., van Sinderen D.;

RT "Sequence and analysis of the genetic locus responsible for surfactin 

RT synthesis in Bacillus subtilis."; 

RL Mol. Microbiol. 8:821-831(1993). 

RN [2] 

RP SEQUENCE FROM N.A. 

RC STRAIN=168;

RX MEDLINE=97124189; PubMed=8969502;

RA Yamane K., Kumano M., Kurita K.;

RT "The 25 degrees-36 degrees region of the Bacillus subtilis chromosome:

RT determination of the sequence of a 146 kb segment and identification

RT of 113 genes.";

RL Microbiology 142:3047-3056(1996). 
RN [3] 

RP SEQUENCE FROM N.A. 

RC STRAIN=168; 

RX MEDLINE=98044033; PubMed=9384377;

RA Kunst F., Ogasawara N., Moszer I., Albertini A.M., Alloni G.,

RA Azevedo V., Bertero M.G., Bessieres P., Bolotin A., Borchert S.,

RA Borriss R., Boursier L., Brans A., Braun M., Brignell S.C., Bron S.,

RA Brouillet S., Bruschi C.V., Caldwell B., Capuano V., Carter N.M.,

RA Choi S.K., Codani J.J., Connerton I.F., Cummings N.J., Daniel R.A.,

RA Denizot F., Devine K.M., Dusterhoft A., Ehrlich S.D., Emmerson P.T.,

RA Entian K.D., Errington J., Fabret C., Ferrari E., Foulger D.,

RA Fritz C., Fujita M., Fujita Y., Fuma S., Galizzi A., Galleron N.,

RA Ghim S.Y., Glaser P., Goffeau A., Golightly E.J., Grandi G.,

RA Guiseppi G., Guy B.J., Haga K., Haiech J., Harwood C.R., Henaut A.,

RA Hilbert H., Holsappel S., Hosono S., Hullo M.F., Itaya M., Jones L.,

RA Joris B., Karamata D., Kasahara Y., Klaerr-Blanchard M., Klein C.,

RA Kobayashi Y., Koetter P., Koningstein G., Krogh S., Kumano M.,

RA Kurita K., Lapidus A., Lardinois S., Lauber J., Lazarevic V.,

RA Lee S.M., Levine A., Liu H., Masuda S., Mauel C., Medigue C.,

RA Medina N., Mellado R.P., Mizuno M., Moestl D., Nakai S., Noback M.,

RA Noone D., O'Reilly M., Ogawa K., Ogiwara A., Oudega B., Park S.H.,

RA Parro V., Pohl T.M., Portetelle D., Porwollik S., Prescott A.M.,

RA Presecan E., Pujic P., Purnelle B., Rapoport G., Rey M., Reynolds S.,

RA Rieger M., Rivolta C., Rocha E., Roche B., Rose M., Sadaie Y.,

RA Sato T., Scanlan E., Schleich S., Schroeter R., Scoffone F.,

RA Sekiguchi J., Sekowska A., Seror S.J., Serror P., Shin B.S., Soldo B.,

RA Sorokin A., Tacconi E., Takagi T., Takahashi H., Takemaru K.,

RA Takeuchi M., Tamakoshi A., Tanaka T., Terpstra P., Tognoni A.,

RA Tosato V., Uchiyama S., Vandenbol M., Vannier F., Vassarotti A.,

RA Viari A., Wambutt R., Wedler E., Wedler H., Weitzenegger T.,

RA Winters P., Wipat A., Yamamoto H., Yamane K., Yasumoto K., Yata K.,

RA Yoshida K., Yoshikawa H.F., Zumstein E., Yoshikawa H., Danchin A.;

RT "The complete genome sequence of the Gram-positive bacterium Bacillus 

RT subtilis."; 

RL Nature 390:249-256(1997). 

CC -!- SUBCELLULAR LOCATION: Integral membrane protein (Potential). 

CC 

CC This SWISS-PROT entry is copyright. It is produced through a collaboration 

CC between the Swiss Institute of Bioinformatics and the EMBL outstation -

CC the European Bioinformatics Institute. There are no restrictions on its

CC use by non-profit institutions as long as its content is in no way 

CC modified and this statement is not removed. Usage by and for commercial 

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/ 

CC or send an email to license@isb-sib.ch).

CC 

DR EMBL; X70356; CAA49820.1; -.

DR EMBL; D50453; BAA08987.1; -. 

DR EMBL; Z99105; CAB12147.1; -. 

DR PIR; 140489; 140489. 

DR SubtiList; BG10172; ycxA. 

DR InterPro; IPR007114; MFS.

DR InterPro; IPR005828; Sub_transporter.

DR Pfam; PF00083; sugar_tr; 1. 

DR PROSITE; PS50850; MFS; 1. 

KW Hypothetical protein; Transmembrane; Complete proteome. 



FT TRANSMEM 9 29 POTENTIAL.

FT TRANSMEM 49 69 POTENTIAL.

FT TRANSMEM 77 97 POTENTIAL.

FT TRANSMEM 100 120 POTENTIAL.

FT TRANSMEM 135 155 POTENTIAL.

FT TRANSMEM 168 188 POTENTIAL.

FT TRANSMEM 217 237 POTENTIAL.

FT TRANSMEM 253 273 POTENTIAL.

FT TRANSMEM 284 304 POTENTIAL.

FT TRANSMEM 309 329 POTENTIAL.

FT TRANSMEM 341 361 POTENTIAL.



FT TRANSMEM 374 394 POTENTIAL. 

SQ SEQUENCE 409 AA; 44858 MW; 8958A43E87E29DD3 CRC64; 



Query Match 6.3%; Score 125; DB 1; Length 409;

Best Local Similarity 20.9%; Pred. No. 0.018; 

Matches 78; Conservative 61; Mismatches 122; Indels 112; Gaps 19; 

Qy 20 LPVGLEVHGNLELVFTVVSTVMM — GLLMFSLGCSVE IRKLWSHIRRPWGIAVGLLC 74 

II: I : I : II I : I : : I : I : : : I I : : I : I : I 

Db 34 LPMADAFHADRSLI SVSVSI FMITTGIVQFFVGFFI DRFSVRKI MALGAVC 84 

Qy 75 QFGLMPFTAYLLAISFSLKPVQAIAVLIMG CCPGGTISNIFTFWVDGDMDLSI 127 

I : : I : : : I I I : : I III : I I I : : 

Db 85 ISASFLVLPYSPNVHVFS AI YGVLGGI GYS CAVGVTTQ YFI S CWFDTHKGLAL 137 

Qy 128 SMTTCSTVAALGMMPLCIYL YTWSWSLQQNLTIPYQNIGITLVCLTIP-VAFG 179

: : I : I I : I I : I II II : | | : : : | : | I 

Db 138 AI LTNAN S AGLWS P P P I WAAAP YHAGW — Q S T YT I L G I VMAAVLVP L LVF GMKH P 191 

Qy 180 VYVNYRW PKQS KI — I LKI GAWGGVLLLW AV 210 

I : I I M I :: I I I I : : : I 

Db 192 P HAQAET VK K S Y DWRG FWN VMKQ S RL I H I L Y F GVFT C G FTMG 1 1 D AH L VP ILK D AHVS H V 251 

Qy 211 AGWLAKGS WNSDI TLLTISFIFPLIGHVTGFLLALFTHQS— W- 252 

I : : I I : III: : I : I I I : : : I I I I 

Db 252 NGMMAAFGAFIIIGGLLAGWLSDLLGSRSVMLSILFFIRLLSLICLLIPILGIHHSDLWY 311 

Qy 253 Q RC RT I S L ET GAQN I QMC I TMLQ L S FT AEH L VQML S F P LAYG 294 

: I : I I : : : | : : | : I I 

Db 312 FGFILLFGLSYTGVIPLTAASISESYQTG LIGSLLGINFFIHQVAGALSVYAGGL 366 

Qy 295 LFQLIDGFLIVAA 307 

I : I : I : : I 
Db 367 FFDMTHGYLLIVA 379 



RESULT 11 
YJIY_ECOLI 

ID YJIY_ECOLI STANDARD; PRT; 721 AA. 

AC P39396; 

DT 01-FEB-1995 (Rel. 31, Created) 

DT 01-FEB-1995 (Rel. 31, Last sequence update) 

DT 28-FEB-2003 (Rel. 41, Last annotation update) 

DE Hypothetical protein yjiY. 

GN YJIY OR B4354. 

OS Escherichia coli.

OC Bacteria; Proteobacteria; Gammaproteobacteria; Enterobacteriales;

OC Enterobacteriaceae; Escherichia.

OX NCBI_TaxID=562;

RN [1] 

RP SEQUENCE FROM N.A. 

RC STRAIN=K12 / MG1655; 

RX MEDLINE=95334362; PubMed=7610040;

RA Burland V.D., Plunkett G. III, Sofia H.J., Daniels D.L.,

RA Blattner F.R.;

RT "Analysis of the Escherichia coli genome VI: DNA sequence of the 

RT region from 92.8 through 100 minutes."; 



RL Nucleic Acids Res. 23:2105-2119(1995). 

CC -!- SUBCELLULAR LOCATION: Integral membrane protein. Inner membrane 
CC (Potential).

CC -!- SIMILARITY: Belongs to the cstA family. 

CC

CC This SWISS-PROT entry is copyright. It is produced through a collaboration 
CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC the European Bioinformatics Institute. There are no restrictions on its
CC use by non-profit institutions as long as its content is in no way 
CC modified and this statement is not removed. Usage by and for commercial 
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/ 
CC or send an email to license@isb-sib.ch).

CC 

DR EMBL; U14003; AAA97251.1; -. 
DR EMBL; AE000506; AAC77310.1; -. 
DR PIR; S56580; S56580. 
DR EcoGene; EG12586; yjiY. 
DR InterPro; IPR003706; CstA. 
DR Pfam; PF02554; CstA; 1. 

KW Hypothetical protein; Transmembrane; Inner membrane;
KW Complete proteome.
FT TRANSMEM 11 31 POTENTIAL.
FT TRANSMEM 36 56 POTENTIAL.
FT TRANSMEM 94 114 POTENTIAL.
FT TRANSMEM 125 145 POTENTIAL.
FT TRANSMEM 169 189 POTENTIAL.
FT TRANSMEM 197 217 POTENTIAL.
FT TRANSMEM 228 248 POTENTIAL.
FT TRANSMEM 263 283 POTENTIAL.
FT TRANSMEM 288 308 POTENTIAL.
FT TRANSMEM 332 352 POTENTIAL.
FT TRANSMEM 380 400 POTENTIAL.
FT TRANSMEM 469 489 POTENTIAL.
FT TRANSMEM 491 511 POTENTIAL.
FT TRANSMEM 537 557 POTENTIAL.
FT TRANSMEM 574 594 POTENTIAL.
FT TRANSMEM 602 622 POTENTIAL.
FT TRANSMEM 674 694 POTENTIAL.
SQ SEQUENCE 721 AA; 77857 MW; D341DB9C

Query Match 5.9%; Score 117; DB 1; Length 721;
Best Local Similarity 27.0%; Pred. No. 0.13;
Matches 55; Conservative 25; Mismatches 66; Indels 58; Gaps 11;

Qy 67 GIAVG— LLCQFGLMPFTAYLLAISFSLKPVQAIAVLI MGCCP 107

111111:11:111 II M Ml

Db 107 GPLVGPVLAAQMGYLPGTLWLLAGWLAGAVQDFMVLFISSRRNGASLGEMIKEEMGPVP 166

Qy 108 GGTI SNI FTFWVDGDMDLSI SMTTCSTVAALGMMPLCI YLYTWSWSLQQNLT I P YQNI GI 167

Ml: |:|:::: I I I I I I :

Db 167 -GTIALFGCFLI MIIILAVLALIWKALAESP W GV 200

Qy 168 TLVCLT I PVA — FGVYVNYRWPKQSKI I LKI GAV-VGGVLLLVVAV — AGWLAKGSWNS 222

I I I : I : I | : | : : | : : I I I I : : I I I : : I I s I

Db 201 FTVCSTVPIALFMGIYMRFIRPG RVGEVSVIGIVLLVASIYFGGVIAHDPYWGP 254

Qy 223 DITLLTISFIFPLIGHVTGFLLAL 246

: I : I III: I : I I
Db 255 ALTFKDTTITFALIGY — AFVSAL 276



RESULT 12 
YCXE_BACME 

ID YCXE_BACME STANDARD; PRT; 286 AA.

AC P40419; 

DT 01-FEB-1995 (Rel. 31, Created) 

DT 01-FEB-1995 (Rel. 31, Last sequence update) 

DT 16-OCT-2001 (Rel. 40, Last annotation update) 

DE Hypothetical 30.5 kDa protein in gdhl 5' region (ORF 2).

OS Bacillus megaterium. 

OC Bacteria; Firmicutes; Bacillales; Bacillaceae; Bacillus.

OX NCBI_TaxID=1404; 

RN [1] 

RP SEQUENCE FROM N.A. 

RC STRAIN=IAM 1030; 

RA Mitamura T., Ebora R.V., Nakai T., Makino Y., Negoro S., Urabe I.,

RA Okada H.;

RT "Structure of isozyme genes of glucose dehydrogenase from Bacillus 

RT megaterium IAM1030."; 

RL J. Ferment. Bioeng. 70:363-369(1990). 

CC -!- DEVELOPMENTAL STAGE: Expressed during sporulation. 

CC -!- SIMILARITY: TO A SIMILAR ORF IN B.SUBTILIS. 

CC 

CC This SWISS-PROT entry is copyright. It is produced through a collaboration 

CC between the Swiss Institute of Bioinformatics and the EMBL outstation -

CC the European Bioinformatics Institute. There are no restrictions on its

CC use by non-profit institutions as long as its content is in no way 

CC modified and this statement is not removed. Usage by and for commercial 

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/ 

CC or send an email to license@isb-sib.ch).

CC 

DR EMBL; D90043; BAA14098.1; -. 

DR PIR; 139851; JS0384. 

DR InterPro; IPR004673; RhaT.

DR TIGRFAMs; TIGR00776; RhaT; 1. 

KW Hypothetical protein; Sporulation. 

SQ SEQUENCE 286 AA; 30490 MW; 95AB89D02511D74D CRC64; 



Query Match 5.9%; Score 116; DB 1; Length 286; 

Best Local Similarity 21.1%; Pred. No. 0.062; 

Matches 64; Conservative 44; Mismatches 91; Indels 104; Gaps 13; 

Qy 66 WG I AVG L L CQ FGLMP FT AYLLAISFSLKPVQAIAVLIMGCCPGGTISNIFTF 117 

III : I I : : I : : I : III I : I : I 

Db 13 WGSIVLFNVKLGGGPYSQTLGTTLGALIFSIGIYIFVHPTFTPLIFGV GWSGL — F 67 

Qy 118 WVDGDMDLSISMTTCSTVAALGMMPLCIYLYTWSWSLQQNLTIPYQNIGITLVCLTIPVA 177 

III::: I I : : I : I I : : 
Db 68 WAVGQ SNQLKSIDLIGVSKTMPI STGLQLVSTSL 101 



Qy 178 FGVYVNYRWPKQS KI I LKI GAWGGVLLLWAVAGWLA KGSWNSDI 224 

I I I I : I : : I II I I I I : : I : I I I I I : : I 

Db 102 FGVIVFHEWSTKTSIIL GVLALIFIIVGIVLASLQSKEEKEAEEGKGNFKKGI 154 



Qy 225 TLLTISFIFPL 1 GHVTGFLLALFTHQSWQRCRT IS LETG 263

: I I I : I I I i I : I I I : : : : : I

Db 155 VI L L I S T VG Y LVYVWARL FN VD GW S AL L P QAI GMVT GG VL LT FKH K P FN K YAI RN 1 1 P G 214

Qy 264 AQNIQMCITMLQLSFTAEHLVQMLSFP LAYGLFQL IDGFLIVAAYQTYK 312

I : I :: :| I :: I |: : I :|: : I I

Db 215 LIWAAGNMFLFISQPKVGVATSFSLSQMGIVISTLGGIIILGEKKT-K 261

Qy 313 RRL 315

I : I

Db 262 RQL 264



RESULT 13 
CYB_TOXGO

ID CYB_TOXGO STANDARD; PRT; 368 AA.

AC O20672; O20928;

DT 30-MAY-2000 (Rel. 39, Created) 

DT 30-MAY-2000 (Rel. 39, Last sequence update) 

DT 10-OCT-2003 (Rel. 42, Last annotation update) 

DE Cytochrome b. 

GN MTCYB OR COB OR CYTB OR CYB. 

OS Toxoplasma gondii. 

OG Mitochondrion. 

OC Eukaryota; Alveolata; Apicomplexa; Coccidia; Eimeriida; Sarcocystidae;

OC Toxoplasma. 

OX NCBI_TaxID=5811; 

RN [1]

RP SEQUENCE FROM N.A.

RA Toursel C., Tomavo S.;

RT "Cytochrome B of Toxoplasma gondii.";

RL Submitted (JUL-1997) to the EMBL/GenBank/DDBJ databases.

RN [2] 

RP SEQUENCE OF 10-368 FROM N.A. 

RC STRAIN=RH;

RA McFadden D.C., Boothroyd J.C.;

RT "Cytochrome B gene from Toxoplasma gondii.";

RL Submitted (SEP-1998) to the EMBL/GenBank/DDBJ databases.

CC -!- FUNCTION: Component of the ubiquinol-cytochrome c reductase

CC complex (complex III or cytochrome b-c1 complex), which is a

CC respiratory chain that generates an electrochemical potential

CC coupled to ATP synthesis (By similarity).

CC -!- COFACTOR: Binds two heme groups non-covalently. Heme 1 (or BL or
CC b562) is low-potential and absorbs at about 562, and heme 2 (or BH

CC or b566) is high-potential and absorbs at about 566 (By

CC similarity).

CC -!- SUBUNIT: The main subunits of complex b-c1 are: cytochrome b,
CC cytochrome c1 and the Rieske protein (By similarity).

CC -!- SIMILARITY: Belongs to the cytochrome b family. 

CC 

CC This SWISS-PROT entry is copyright. It is produced through a collaboration 

CC between the Swiss Institute of Bioinformatics and the EMBL outstation -

CC the European Bioinformatics Institute. There are no restrictions on its

CC use by non-profit institutions as long as its content is in no way 

CC modified and this statement is not removed. Usage by and for commercial 

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/ 

CC or send an email to license@isb-sib.ch).

CC

DR EMBL; AF015627; AAB82741.1; -.

DR EMBL; AF023246; AAC34138.1; -.

DR InterPro; IPR005798; Cytb_b6_C.

DR InterPro; IPR005797; Cytb_b6_N.

DR Pfam; PF00032; cytochrome_b_C; 1.

DR Pfam; PF00033; cytochrome_b_N; 1.

DR PROSITE; PS00192; CYTOCHROME_B_HEME; 1.

DR PROSITE; PS00193; CYTOCHROME_B_QO; 1.

KW Electron transport; Mitochondrion; Respiratory chain; Transmembrane;

KW Heme.

FT METAL 82 82 IRON 1 (HEME B562 AXIAL LIGAND).

FT METAL 96 96 IRON 2 (HEME B566 AXIAL LIGAND).

FT METAL 178 178 IRON 1 (HEME B562 AXIAL LIGAND).

FT METAL 192 192 IRON 2 (HEME B566 AXIAL LIGAND).

SQ SEQUENCE 368 AA; 41594 MW; CC7C6BD3784287CA CRC64;

Query Match 5.7%; Score 112.5; DB 1; Length 368;

Best Local Similarity 24.6%; Pred. No. 0.15;

Matches 57; Conservative 37; Mismatches 101; Indels 37; Gaps 8;



Qy 84 YLLAISFSLKPVQAIAVLIM GCCPGGTI SNI FTFWVDGDMDLSI SMTTCSTVAALG 139

:|:|::| I : : I : | :: :: I : II I I

Db 34 FLVAMTFVLQIITGITLAFRYTSEASCAFASVQHLVREVAAGWEFRMLHATTASFVFLCI 93

Qy 140 MMPLCIYLYTWSWSLQQNLTIPYQNIGITLVCLTIPVAFGVYVNYRWPKQSKIILKIGAV 199

: : : I I I I : I II : : I : I III II II I : I II

Db 94 LIHMTRGLYNWSYSY LTTAWMS-GLVLYLLTIATAFLGYV-LPWGQMS FWGAT 144

Qy 200 VGGVLLLWAVAGWLAKG SWN S D I T L LTISFIFPLIGHVTGFLLALFTHQSWQRCR 256

I II: I I : Ihll : I I I I I : I : I :
Db 145 VITNLLS PI PYLVPWLLGGYYVSDVTLKRFFVLHFI LPFI GCI 1 1 VLHI FYLHLN 199

Qy 257 TISLETGAQNIQMCITMLQLSF TAEHLVQMLS FPLAYGLFQL 298

I : I I I : : : I : | : : : I : I I : I

Db 200 GSSNPAGIDTALKVAFYPHMLMTDAKCLSYLIGLIFLQAAFGLMEL 245



RESULT 14 
Y944_SYNY3 

ID Y944_SYNY3 STANDARD; PRT; 383 AA. 

AC P74311; 

DT 01-NOV-1997 (Rel. 35, Created) 

DT 01-NOV-1997 (Rel. 35, Last sequence update) 

DT 28-FEB-2003 (Rel. 41, Last annotation update) 

DE Hypothetical protein slr0944. 

GN SLR0944. 

OS Synechocystis sp. (strain PCC 6803) . 

OC Bacteria; Cyanobacteria; Chroococcales; Synechocystis.
OX NCBI_TaxID=1148; 
RN [1] 

RP SEQUENCE FROM N.A. 

RX MEDLINE=97061201; PubMed=8905231;

RA Kaneko T., Sato S., Kotani H., Tanaka A., Asamizu E., Nakamura Y.,
RA Miyajima N., Hirosawa M., Sugiura M., Sasamoto S., Kimura T.,
RA Hosouchi T., Matsuno A., Muraki A., Nakazaki N., Naruo K., Okumura S.,
RA Shimpo S., Takeuchi C., Wada T., Watanabe A., Yamada M., Yasuda M.,
RA Tabata S.;

RT "Sequence analysis of the genome of the unicellular cyanobacterium

RT Synechocystis sp. strain PCC6803. II. Sequence determination of the

RT entire genome and assignment of potential protein-coding regions.";

RL DNA Res. 3:109-136(1996).

CC -!- SUBCELLULAR LOCATION: Integral membrane protein (Potential).

CC -!- SIMILARITY: BELONGS TO THE ACR3 FAMILY.

CC

CC This SWISS-PROT entry is copyright. It is produced through a collaboration

CC between the Swiss Institute of Bioinformatics and the EMBL outstation -

CC the European Bioinformatics Institute. There are no restrictions on its

CC use by non-profit institutions as long as its content is in no way

CC modified and this statement is not removed. Usage by and for commercial

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/

CC or send an email to license@isb-sib.ch).

CC

DR EMBL; D90914; BAA18405.1; -.

DR PIR; S76146; S76146.

DR InterPro; IPR004706; Acr3.

DR InterPro; IPR002657; BilAc/Na_symport.

DR Pfam; PF01758; SBF; 1.

DR TIGRFAMs; TIGR00832; acr3; 1.

KW Hypothetical protein; Transmembrane; Complete proteome.

FT TRANSMEM 25 45 POTENTIAL.

FT TRANSMEM 53 73 POTENTIAL.

FT TRANSMEM 103 123 POTENTIAL.

FT TRANSMEM 139 159 POTENTIAL.

FT TRANSMEM 166 186 POTENTIAL.

FT TRANSMEM 200 220 POTENTIAL.

FT TRANSMEM 238 258 POTENTIAL.

FT TRANSMEM 272 292 POTENTIAL.

FT TRANSMEM 309 329 POTENTIAL.

FT TRANSMEM 332 352 POTENTIAL.

SQ SEQUENCE 383 AA; 42402 MW; 3D8C4CF8EA2FF08B CRC64;



Query Match 5.5%; Score 109.5; DB 1; Length 383; 

Best Local Similarity 20.9%; Pred. No. 0.26; 

Matches 58; Conservative 55; Mismatches 115; Indels 49; Gaps 13; 

Qy 40 VMMGLLMFSLGCSVEIRKLWSHIRRPWGIAVGLLCQFGLMPFTAYLLA ISFSLKPVQ 96 

: : : | : : :: : : : | : : j : ""III "I " • i ' 

Db 62 I C L F FMM Y P I MVK I D F S Q ARQ AVKAP K P VI LT L WN WVT K P FTMVI FAQ F FL G YL FAP L L 121 

Qy 97 AIAVLIMGCCPGGTISNIF TFWVDGDMDLSISMTTCSTVAALGM 140 

: I I I : : I : I I : I III: 

Db 122 TATEIIRG — QEVTIANSYIAGCILLGIAPCTAMVLMW — GYLSYSNQGLTLVMVAVNSL 177 



Qy 141 MPLCIYLYTWSWSL-QQNLTIPYQNIGIT-LVCLTIPVAFGVYVNY RW-PKQ 189 

1:1 II I I I : I : I I : : I : : : I : I I : I I : I I 

Db 178 AMLFLYAPLGKWLLAASNLTVPWQTIVLSVLIWGLPLAAGIYSRYWILKHKGRQWFESQ 237 

Qy 190 S KI I LKI GAWGGVLLLW — AVAGWLAKGS WN S DI TLLT I SFIFPLIGHVTG 241 

I | : | : I I : : I I : : : I |: : : I I I I I : I I 

Db 238 FLHYLSPIAIVALLLTLILLFAFKGELIVNNPLH— IFLIAVPLFIQTNFIF-LITYVLG 294 



Qy 242 FLLAL FTHQSWQRCRT I S LET GAQN I QMCI TMLQL S F 278 

II I : : : I : : : : I : I 



Db 295 LKLKL SYEDAAPAALIGASNHFEVAIATAVMLF 327



RESULT 15 
NU5M_ANOQU

ID NU5M_ANOQU STANDARD; PRT; 576 AA.

AC P33510; 

DT 01-FEB-1994 (Rel. 28, Created) 

DT 01-FEB-1994 (Rel. 28, Last sequence update) 

DT 01-OCT-1996 (Rel. 34, Last annotation update) 

DE NADH-ubiquinone oxidoreductase chain 5 (EC 1.6.5.3). 

GN ND5.

OS Anopheles quadrimaculatus (Mosquito).

OG Mitochondrion.

OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota;

OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; Anopheles.

OX NCBI_TaxID=7166; 

RN [1] 

RP SEQUENCE FROM N.A. 

RC STRAIN=Orlando; 

RX MEDLINE=92190510; PubMed=2134168;

RA Cockburn A.F., Mitchell S.E., Seawright J.A.;

RT "Cloning of the mitochondrial genome of Anopheles quadrimaculatus.";

RL Arch. Insect Biochem. Physiol. 14:31-36(1990).

CC -!- CATALYTIC ACTIVITY: NADH + ubiquinone = NAD(+) + ubiquinol.

CC 

CC This SWISS-PROT entry is copyright. It is produced through a collaboration 

CC between the Swiss Institute of Bioinformatics and the EMBL outstation -

CC the European Bioinformatics Institute. There are no restrictions on its

CC use by non-profit institutions as long as its content is in no way 

CC modified and this statement is not removed. Usage by and for commercial 

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/ 

CC or send an email to license@isb-sib.ch).

CC 

DR EMBL; L04272; AAA93547.1; -. 

DR InterPro; IPR003916; NADHub_oxred5.

DR InterPro; IPR001750; Oxidored_q1.

DR InterPro; IPR001516; Oxidored_q1_N.

DR Pfam; PF00361; oxidored_q1; 1.

DR Pfam; PF00662; oxidored_q1_N; 1.

DR PRINTS; PR01434; NADHDHGNASE5.

KW Oxidoreductase; NAD; Ubiquinone; Mitochondrion. 

SQ SEQUENCE 576 AA; 65913 MW; A82B45D67F430F42 CRC64; 



Query Match 5.4%; Score 107.5; DB 1; Length 576; 

Best Local Similarity 21.6%; Pred. No. 0.56; 

Matches 80; Conservative 52; Mismatches 105; Indels 133; Gaps 21; 

Qy 8 SSACPANSSEEELPVGLEVHGN LELVFTWST VMMGLLMFSLG 50 

Mil: II II: I : I : : I : : I I I I I 

Db 200 S S WL P A- AMAAPT P VS ALVH S S T LVTAGL YLL I RFN I LLT DWWMGQ FML L I S GLTMFMAG 258 

Qy 51 CSV EI RKLWSHI RRPWGI AVGLLCQFGLM P FT AY L LAI S F S L K P VQ AI AVL I 102 

: : : I : I I : I I II I III I hi 

Db 259 LGANFEFDLKKI IALSTLSQLGLMMSILSMGFYKLAFFHLLTHALFKALLF 309 



Qy 103 MGCCPGGTI SNI FT FWVDGDMDLSISMT-TCSTVAALGM — MPLCI YLYTWSWSLQQ 156



I I I I I : : | : : | : : | : | | | | : I I I = I : 

Db 310 M--CAGSIIHNMKNSQDIR]yn^GSLSMSMPLTCSCFNVANLALCGMPFIAGFYSKDLILEM 367 

Qy 157 NLTIPYQNI GITLVCLTIPVAFGVYVNYRWPKQSKII 193 

: : : I I : I : I I I : : I I : I : : 

Db 368 -VSLSYVNVFSFFLFFFSTGLT-VCYSFRL VY Y SMT GD FN S S VLH P LN D S GWTML F S 422

Qy 194 LKI GAWGGVLLLWAVAGWLAKGSW NSDITLLTISFIFPLIGHV 239 

I I I I : I I : I II | : : | | : | : : : I : 

Db 423 I FFLMIMAVI GGSML SWLMFLNP SMI CLP FDLKMLTL- FVC- 1 LGGL 467

Qy 240 TGFLLALFTHQSWQRCRTISLETGAQNIQMCITMLQLSF TAEHLVQM 286 

|:||: I: : I I I 

Db 468 IGYLLS NVS L FFTNKAL Y F YN FT Y FAGSMW FMP WS T I GV 507

Qy 287 LSFPLAYGLF 296

:::M II: 
Db 508 INYPLKLGLY 517 



Search completed: March 23, 2004, 14:36:18 
Job time : 19 secs



